Preface


It was with great delight that I learned of the imminent publication of an 
English-language edition of my introductory course on mathematical analysis 
under the editorship of Dr. R. A. Silverman. Since the literature already includes 
many fine books devoted to the same general subject matter, I would like to take 
this opportunity to point out the special features of my approach. 

Mathematical analysis is a large “continent” concerned with the concepts of 
function, derivative, and integral. At present this continent consists of many 
“countries” such as differential equations (ordinary and partial), integral 
equations, functions of a complex variable, differential geometry, calculus of 
variations, etc. But even though the subject matter of mathematical analysis can 
be regarded as well-established, notable changes in its structure are still under 
way. In Goursat’s classical “cours d’analyse” of the twenties all of analysis is 
portrayed on a kind of “great plain,” on a single level of abstraction. In the books 
of our day, however, much attention is paid to the appearance in analysis of 
various “stages” of abstraction, i.e., to various “structures” (Bourbaki’s term) 
characterizing the mathematico-logical foundations of the original constructions.
This emphasis on foundations clarifies the gist of the ideas involved, thereby
freeing mathematics from concern with the idiosyncrasies of each object under
consideration. At the same time, an understanding of the nub of the matter 
allows one to take account immediately of new objects of a different individual 
nature but of exactly the same “structural depth.” 

Consider, for example, Picard’s proof of the existence and uniqueness of the 
solution of a differential equation in which the desired function is successively 
approximated on a given interval by other functions in accordance with certain 
rules. This proof had been known for some time when Banach and others 
formulated the “fixed point method.” The latter plainly reveals the nub of 
Picard’s proof, namely the presence of a contraction operator in a certain metric 
space. In this regard, the specific context of Picard’s problem, i.e., numerical 
functions on an interval, a differential equation, etc., turns out to be quite 
irrelevant. As a result, the fixed point method not only makes the “geometrical” 
proof of Picard’s theorem more transparent, but, by further developing the key
idea of Picard’s proof, even leads to the proof of existence theorems involving
neither functions on an interval nor a differential equation. Considerations of the 
same kind apply equally well to the geometry of Hilbert space, the study of 
differentiable functionals, and many other topics. 

Analysis presented from this point of view can be found, for example, in the 
superb books by J. Dieudonné. However, it seems to me that Dieudonné’s books,
for all their formal perfection, require that the reader’s “mathematical I. Q.” be
too high. Thus, for my part, I have tried to accommodate the interests of a larger
population of those concerned with mathematics. Therefore in many cases where 
Dieudonné instantly and almost miraculously produces deep classical results 
from general considerations, so that the reader can only take off his hat in silent 
admiration, the reader of my course is invited to climb with me from the 
foothills of elementary topics to successive levels of abstraction and then look 
down from above on the various valleys which now come into his field of view. 
Perhaps this approach is thornier, but in any event the mathematical traveler will 
thereby acquire the training needed for further exploration on his own. 

The present course begins with a systematic study of the real numbers, 
understood to be a set of objects satisfying certain definite axioms. There are 
other approaches to the theory of real numbers where things I take as axioms are 
proved, starting from set theory and the axioms for the natural numbers (for 
example, a rigorous treatment in this vein can be found in Landau’s famous 
course). Both treatments have a key deficiency, namely the absence of a proof of 
the compatibility of the axioms. Evidently modern mathematics lacks a 
construction of the real numbers which is free of this shortcoming. The whole 
question, far from being a mere technicality, involves the very foundations of 
mathematical thought. In any event, this being the case, it is really not very 
important where one starts a general treatment of analysis, and my choice is 
governed by the consideration that the starting point bear as close a resemblance 
as possible to analytic constructions proper. 

The concepts of a mathematical structure and an isomorphism are introduced 
in Chapter 2, after a brief digression on set theory, and a proof of the uniqueness 
of the structure of real numbers (to within an isomorphism) is given as an 
illustration. Two other structures are then introduced, namely n-dimensional 
space and the field of complex numbers. After a detailed treatment of metric 
spaces in Chapter 3, a general theory of limits is developed in Chapter 4. The 
starting point of this theory is taken to be on the one hand a set E equipped with 
a “direction,” i.e., a system of subsets of E with an empty intersection (this
notion, closely related to the “filters” of H. Cartan, is more restricted than that of
a filter but is entirely adequate for the purposes of analysis), and on the other
hand a function defined on E taking values in a metric space. All the limits
considered in analysis, from limits of a numerical sequence to the notions of the 
derivative and integral, are comprised in this scheme. Chapter 5 is concerned 
first with some theorems on continuous numerical functions on the real line and 
then with the use of functional equations to introduce the logarithm (from which 
the exponential is obtained by inversion) and the trigonometric functions. The 
algebra and topology of complex numbers and the fundamental theorem of 
algebra are presented as applications. Chapter 6 is on infinite series, dealing not 
only with numerical series but also with series whose terms are vectors and 
functions (including power series). Chapters 7 and 8 treat differential calculus 
proper, with Taylor’s series leading to a natural extension of real analysis into 
the complex domain. Chapter 9 presents the general theory of Riemann 
integration, together with a number of its applications. The further development 
of analysis requires the technique of analytic functions, which is considered in 
detail in Chapter 10. Finally Chapter 11 is devoted to improper integrals, and 
makes full use of the technique of analytic functions now at our disposal. 

Each chapter is equipped with a set of problems; hints and answers to most 
of these problems appear at the end of the book. To a certain extent, the 
problems help to develop necessary technical skill, but they are primarily 
intended to illustrate and amplify the material in the text. 


G.E.S. 


1 Real Numbers 


1.1. Set-Theoretic Preliminaries 
1.11. Words like “aggregate,” “collection,” and “set” come up at once when
talking about “objects” (or “elements”) of any kind. Thus one can talk about the 
set of students in an auditorium, the set of grains of sand on a beach, the set of 
vertices or the set of sides of a polygon, and so on. In each of these examples the 
set in question consists of a definite number of elements, which can be estimated 
within certain limits, even though it may be difficult in practice to find the 
number of elements exactly. Such sets are said to be finite. 

In mathematics one must often deal with sets consisting of a number of 
objects which is not finite. The simplest examples of such sets are the set 1, 2, 3, 
... of all natural numbers (positive integers) and the set of all the points on a line 
segment (precise definitions of these objects will be given later). Such sets are 
said to be infinite. To the category of sets we also assign the empty set, namely, 
the set containing no elements at all.

As a rule, sets will be denoted by large letters A, B, C, ... and elements of sets
will be denoted by small letters. By a ∈ A (or A ∋ a) we mean that a is an
element of the set A, while by a ∉ A (or A ∌ a) we mean that a is not an element
of the set A. By A ⊂ B (or B ⊃ A) we mean that every element of the set A is an
element of the set B; the set A is then said to be a subset of the set B. The largest
subset of the set B is obviously the set B itself, while the smallest subset of B is
the empty set. Any other subset of the set B, containing some but not all
elements of B, is called a proper subset of B. The symbols ∈, ∋, ⊂, ⊃ are called
inclusion relations. Suppose both A ⊂ B and B ⊂ A. Then every element of the
set A is an element of the set B, and conversely, every element of the set B is an
element of the set A. It follows that the sets A and B consist of precisely the same
elements and hence coincide, a fact expressed by writing A = B. The analogous
formula for elements, namely a = b, simply means that a and b are one and the
same element.

Sets can be specified in various ways. The simplest way is to write the 
elements of the set explicitly between curly brackets; e.g., A = {1, 2, 3, ...} is the 
set of all natural numbers (positive integers). Another way is to specify some
property of the elements of the set; e.g., A = {x: x^2 − 1 < 0} is the set of all x
satisfying the inequality x^2 − 1 < 0 written after the colon.


1.12. Unions and intersections. Let A, B, C, ... be given sets. Then the set of all
elements of A, B, C, ... belonging to at least one of the sets A, B, C, ... is called
the union of the sets A, B, C, ..., while the set of all elements of A, B, C, ...
belonging to every one of the sets A, B, C, ... is called the intersection of the sets
A, B, C, .... For example, let A = {6, 7, 8, ...}, B = {3, 6, 9, ...}, i.e., let A be the
set of all natural numbers greater than 5 and B the set of all natural numbers
divisible by 3. Then the union of A and B is the set

S = {3, 6, 7, 8, 9, 10, ...},

the set of all natural numbers except 1, 2, 4, and 5, while the intersection of A
and B is the set

D = {6, 9, 12, ...},

the set of all natural numbers divisible by 3 except 3 itself. In the case where the
sets A, B, C, ... have no elements in common, the intersection of A, B, C, ... is the
empty set, and the sets A, B, C, ... are then said to be nonintersecting. For
example, the three sets

A = {1, 2}, B = {2, 3}, C = {1, 3}

are nonintersecting, even though any two of the sets share a common element.
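
The two examples above can be checked mechanically. The following is only an illustrative sketch: the sets in the text are infinite, so the code truncates them at an assumed cutoff `LIMIT`, and the names `S`, `D`, `X`, `Y`, `Z` are chosen here for illustration.

```python
LIMIT = 30  # assumed cutoff; the sets A and B in the text are infinite

A = set(range(6, LIMIT + 1))      # natural numbers greater than 5 (truncated)
B = set(range(3, LIMIT + 1, 3))   # natural numbers divisible by 3 (truncated)

S = A | B   # union: elements belonging to at least one of the sets
D = A & B   # intersection: elements belonging to every one of the sets

# Three sets that are nonintersecting as a family,
# although any two of them share a common element.
X, Y, Z = {1, 2}, {2, 3}, {1, 3}
pairwise = [X & Y, Y & Z, X & Z]
common = X & Y & Z

print(sorted(S)[:6], sorted(D)[:3])
print(pairwise, common)
```

Within the cutoff, `S` omits exactly 1, 2, 4, and 5, and `D` consists of the multiples of 3 other than 3 itself, as stated in the text.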

We can consider the union and intersection of both a finite number of sets and 
an infinite number of sets. For example, the union of the sets of points on all 
(infinitely many) lines in the plane passing through a given point O is clearly the
set of all points in the plane, while the intersection of all these sets consists of 
the single point O. 

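To denote the union S of given sets, say A_1, A_2, ..., A_ν, ..., we use the symbols
Σ or ∪, writing

\[ S = \sum_{\nu=1}^{\infty} A_\nu \quad \text{or} \quad S = \bigcup_{\nu=1}^{\infty} A_\nu, \]

while to denote the intersection D of the sets, we use the symbols Π or ∩, writing

\[ D = \prod_{\nu=1}^{\infty} A_\nu \quad \text{or} \quad D = \bigcap_{\nu=1}^{\infty} A_\nu. \]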


1.2. Axioms for the Real Number System 


The following considerations stem from the simplest properties of numbers, 
known partly from everyday experience and partly from elementary 
mathematics.† Rather than define the real numbers separately, we will define the
whole set of real numbers at once as a set of elements equipped with certain
operations and relations which in turn satisfy four groups of axioms. The first
group consists of the addition axioms, the second of the multiplication axioms,
the third of the order axioms, and the fourth of a single axiom called the least
upper bound axiom.


Definition. By the real number system R is meant the set whose elements x, y, z,
..., called real numbers, satisfy the four groups of axioms given in Secs.
1.21–1.24. The set R is often called the real line, with its elements in turn called
points. 


1.21. The addition axioms. To every pair of elements x and y in R there
corresponds a (unique) element x + y, called the sum of x and y, where the rule
associating x + y with x and y has the following properties:

(a) x + y = y + x for every x and y in R (addition is commutative);
(b) (x + y) + z = x + (y + z) for every x, y, z in R (addition is associative);
(c) R contains an element 0, called the zero element, such that x + 0 = x for every
x in R;
(d) For every x in R there exists an element y in R, called the negative of
x, such that x + y = 0.


1.22. The multiplication axioms. To every pair of elements x and y in R there
corresponds a (unique) element x · y (or xy), called the product of x and y, where
the rule associating xy with x and y has the following properties:

(a) xy = yx for every x and y in R (multiplication is commutative);
(b) (xy)z = x(yz) for every x, y, z in R (multiplication is associative);†
(c) R contains an element 1 ≠ 0, called the unit element, such that 1 · x = x for
every x in R;
(d) For every x ≠ 0 in R there exists an element u in R, called the
reciprocal of x, such that xu = 1;
(e) The formula

(x + y)z = xz + yz

holds for every x, y, z in R (multiplication is distributive over addition).


The last axiom connects the operation of multiplication with the operation of 
addition introduced in Sec. 1.21. 

A set of objects x, y, z, ... satisfying the axioms of Secs. 1.21 and 1.22 is called
a number field or simply a field. 


1.23. The order axioms. For every pair of elements x and y in R one (or both) of
the relations x ≤ y (x is less than or equal to y) or y ≤ x holds, where ≤ has the
following properties:

(a) x ≤ x for every x in R, and x ≤ y, y ≤ x together imply x = y;
(b) If x ≤ y, y ≤ z, then x ≤ z;
(c) If x ≤ y, then x + z ≤ y + z for every z in R;
(d) If 0 ≤ x, 0 ≤ y, then 0 ≤ xy.

The relation x ≤ y can also be written in the form y ≥ x (y is greater than or
equal to x). If x ≤ y and x ≠ y, we write x < y (x is less than y) or y > x (y is
greater than x).


1.24. A set E ⊂ R is said to be bounded from above if there exists an element z ∈
R such that x ≤ z for every x ∈ E, a fact expressed concisely by writing E ≤ z.
Every number z with the above property relative to a set E is called an upper
bound of E. An upper bound z_0 of the set E is called the least upper bound of E
if every other upper bound z of E is greater than or equal to z_0 (why is z_0
unique?). The least upper bound of E is denoted by sup E (from the Latin
“supremum”). We can now state the following

Least upper bound axiom. Every set E ⊂ R which is bounded from above has a
least upper bound.


1.3. Consequences of the Addition Axioms 


Our next task is to deduce various implications of the above axioms which will 
be needed later. We start with certain consequences of the addition axioms. 


1.31. THEOREM. The system R contains a unique zero element.


Proof. Suppose R contains two zero elements 0_1 and 0_2. Then it follows from
Axioms a and c of Sec. 1.21 that 0_1 = 0_1 + 0_2 = 0_2 + 0_1 = 0_2. ∎


1.32. THEOREM. Every element x in R has a unique negative.


Proof. Suppose x has two negatives y_1 and y_2, so that x + y_1 = x + y_2 = 0. Then it
follows from Axioms a–c of Sec. 1.21 that

y_2 = y_2 + 0 = y_2 + (x + y_1) = (y_2 + x) + y_1 = y_1 + (x + y_2) = y_1 + 0 = y_1. ∎


The negative of the element x is denoted by −x. The sum x + (−y), written
more concisely as x − y, is called the difference of x and y. The negative −(x + y)
of the sum x + y is the sum of the negatives of x and y, since x + y − x − y = x − x
+ y − y = 0 + 0 = 0.


1.33. THEOREM. The equation

a + x = b (1)

has a unique solution in R, equal to b − a.


Proof. Adding the number −a to both sides of (1) and using Axioms a–c of Sec.
1.21, we find that a + x − a = x + a − a = x + 0 = x = b − a, so that the solution, if
it exists, equals b − a. But b − a is a solution, since a + (b − a) = a + b + (−a) = b
+ a + (−a) = b + 0 = b. ∎


1.4. Consequences of the Multiplication Axioms 


1.41. a. THEOREM. The system R contains a unique unit element.


Proof. Suppose R contains two unit elements 1_1 and 1_2. Then it follows from
Axioms a and c of Sec. 1.22 that 1_1 = 1_2 · 1_1 = 1_1 · 1_2 = 1_2. ∎


b. THEOREM. Every element x ≠ 0 in R has a unique reciprocal.


Proof. Suppose x has two reciprocals u_1 and u_2, so that xu_1 = xu_2 = 1. Then it
follows from Axioms a–c of Sec. 1.22 that

u_2 = 1 · u_2 = (xu_1)u_2 = x(u_1u_2) = x(u_2u_1) = (xu_2)u_1 = 1 · u_1 = u_1. ∎


1.42. The reciprocal of the element x is denoted by 1/x. The reciprocal 1/xy of
the product xy is the product of the reciprocals of x and y, since

\[ \left(\frac{1}{x} \cdot \frac{1}{y}\right) xy = \frac{1}{x} \cdot x \cdot \frac{1}{y} \cdot y = 1 \cdot 1 = 1. \]

The product x · 1/z, written more concisely as x/z, is called the quotient (or
ratio) of x and z.


1.43. Definition. The numbers

1, 2 = 1 + 1, 3 = 2 + 1, ..., n = (n − 1) + 1, ...

are called natural numbers (or positive integers). Thus the set of natural numbers
can be defined as the smallest numerical set containing the number 1 and
containing the number n + 1 whenever it contains the natural number n.

In many problems one must show that some numerical set A (e.g., the set of
all natural numbers n for which some property T_n, depending on n, is valid)
contains all natural numbers. The method of mathematical induction, used in
such problems, consists in verifying that

(1) A contains 1;
(2) If A contains a natural number n, then A also contains n + 1.

If these two conditions are satisfied, it is clear from the foregoing that A contains
all natural numbers. Thus the method of mathematical induction is a
consequence of the very definition of the natural numbers.


1.44. a. By the integers we mean the natural numbers together with their 
negatives and the number zero. 


b. Let m be an integer. Then the integer 2m is said to be even, while the integer 
2m + 1 is said to be odd. 


c. By the rational numbers we mean all quotients of the form m/n, where m and 
n are integers and n ≠ 0.


d. All other real numbers are said to be irrational. 


1.45. THEOREM. The equation

ax = b (a ≠ 0) (1)

has a unique solution in R, equal to b/a.


Proof. Dividing both sides of (1) by a and using Axioms a–c of Sec. 1.22, we
find that

\[ \frac{1}{a}(ax) = \left(\frac{1}{a} \cdot a\right) x = 1 \cdot x = x = \frac{b}{a}, \]

so that the solution, if it exists, equals b/a. But b/a is a solution, since

\[ a \cdot \frac{b}{a} = a \cdot \frac{1}{a} \cdot b = \left(a \cdot \frac{1}{a}\right) b = 1 \cdot b = b. \] ∎


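1.46. By definition,

\[ x^n = \underbrace{x \cdot x \cdots x}_{n \text{ times}} \qquad (n = 1, 2, \ldots), \]

and hence obviously

\[ x^m x^n = x^{m+n}, \qquad (x^m)^n = x^{mn} \tag{2} \]

for arbitrary m, n = 1, 2, .... The expression x^n is called the nth power of x. To
define x^n for arbitrary integers, we set

\[ x^0 = 1, \qquad x^{-n} = \frac{1}{x^n} \]

for arbitrary x ≠ 0. We now verify that the formulas (2) continue to hold for
arbitrary integers m and n. Suppose first that m > 0, n = −q < 0. Then

\[ x^m x^n = x^m \cdot \frac{1}{x^q} = \frac{x^{m-q} x^q}{x^q} = x^{m-q} = x^{m+n} \]

if q ≤ m, while

\[ x^m x^n = x^m \cdot \frac{1}{x^q} = \frac{x^m}{x^m x^{q-m}} = \frac{1}{x^{q-m}} = x^{m-q} = x^{m+n} \]

if q > m, by Sec. 1.42. On the other hand, if m = −p < 0, n = −q < 0, then

\[ x^m x^n = \frac{1}{x^p} \cdot \frac{1}{x^q} = \frac{1}{x^{p+q}} = x^{-p-q} = x^{m+n}, \]

again by Sec. 1.42. The second of the formulas (2) is proved similarly.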
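
As a numerical sanity check of the rule x^m x^n = x^(m+n) for arbitrary integer exponents, the sketch below implements powers directly from the definitions of this section (repeated multiplication, x^0 = 1, x^(−n) = 1/x^n) using exact rational arithmetic. The function names `power` and `product_rule_holds` are illustrative, and rational numbers stand in for general reals.

```python
from fractions import Fraction


def power(x: Fraction, n: int) -> Fraction:
    """x^n for an integer n: repeated multiplication for n > 0,
    x^0 = 1 and x^(-n) = 1 / x^n for x != 0."""
    if n > 0:
        result = Fraction(1)
        for _ in range(n):
            result *= x
        return result
    if x == 0:
        raise ZeroDivisionError("x^n with n <= 0 requires x != 0")
    if n == 0:
        return Fraction(1)
    return 1 / power(x, -n)


def product_rule_holds(x: Fraction, exponents) -> bool:
    """Check x^m * x^n == x^(m+n) for all m, n in the given exponents."""
    return all(
        power(x, m) * power(x, n) == power(x, m + n)
        for m in exponents
        for n in exponents
    )


print(product_rule_holds(Fraction(-3, 7), range(-4, 5)))
```

A finite check of this kind does not replace the proof; it only confirms that the case analysis (positive, zero, and negative exponents) is consistent on sample values.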
1.47. a. THEOREM. The formula

0 · x = 0

holds for every x in R.

Proof. Clearly

0 · x + 1 · x = (0 + 1)x = 1 · x = x, 0 · x + 1 · x = 0 · x + x,

and hence

x = 0 · x + x.

But then

0 · x = x − x = 0,

by Theorem 1.33. ∎


It follows that 0 has no inverse, since the equation 0 · x = 1 is impossible. This
justifies the high school rule: “Don’t divide by zero.”


b. On the other hand, if xy = 0 and x ≠ 0, then

\[ y = \left(\frac{1}{x} \cdot x\right) y = \frac{1}{x}(xy) = \frac{1}{x} \cdot 0 = 0. \]

Thus, if a product vanishes, so does at least one of its factors.


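1.48. THEOREM. If u ≠ 0, v ≠ 0, then

\[ \frac{x}{u} + \frac{y}{v} = \frac{xv + yu}{uv}. \]

Proof. Merely note that

\[ \frac{xv + yu}{uv} = \frac{1}{uv}(xv + yu)
= \frac{1}{u} \cdot \frac{1}{v} \cdot xv + \frac{1}{u} \cdot \frac{1}{v} \cdot yu
= \left(\frac{1}{u} x\right)\left(\frac{1}{v} v\right) + \left(\frac{1}{v} y\right)\left(\frac{1}{u} u\right)
= \left(\frac{1}{u} x\right) \cdot 1 + \left(\frac{1}{v} y\right) \cdot 1
= \frac{x}{u} + \frac{y}{v}. \] ∎
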
1.49. THEOREM. The formula

−x = (−1)x (3)

holds for every x in R.


Proof.† By Theorem 1.47a, (−1)x + x = [(−1) + 1]x = 0 · x = 0, which implies (3). ∎


The results proved in Secs. 1.3 and 1.4 guarantee that the real numbers satisfy 
all the familiar identities of elementary algebra (like the binomial theorem, the 
formulas for the sums of arithmetic and geometric progressions, and various 
formulas involving determinants). 


1.5. Consequences of the Order Axioms 


1.51. Order and addition 
a. LEMMA. If x ≤ y, y ≤ z, and x = z, then x = y = z.


Proof. Since y ≤ z = x, we have y ≤ x and hence y = x by Axiom a of Sec. 1.23.


b. LEMMA. If x < y, y ≤ z, then x < z. Similarly, if x ≤ y, y < z, then x < z.


Proof. An immediate consequence of Lemma 1.51a. ∎


c. THEOREM. The following inequalities are equivalent: x ≤ y, 0 ≤ y − x, −y ≤ −x, x
− y ≤ 0.

Proof. Adding −x to both sides of the first inequality and invoking Axiom c of
Sec. 1.23, we get the second inequality. Similarly, adding −y to both sides of the
second inequality, we get the third, adding x to both sides of the third inequality,
we get the fourth, and, finally, adding y to both sides of the fourth inequality, we
get back the first. ∎


d. LEMMA. If x < y, then x + z < y + z for every z in R.


Proof. Clearly x < y implies x ≤ y and hence x + z ≤ y + z. But if x + z = y + z,
then, adding −z to both sides, we get x = y, which contradicts x < y. It follows
that x + z < y + z.


e. THEOREM. If x_1 ≤ y_1, ..., x_n ≤ y_n, then

x_1 + ··· + x_n ≤ y_1 + ··· + y_n,

where

x_1 + ··· + x_n < y_1 + ··· + y_n

if x_j < y_j for at least one pair x_j, y_j.


Proof. By Axiom c of Sec. 1.23,

x_1 + x_2 + ··· + x_n ≤ y_1 + x_2 + ··· + x_n ≤ y_1 + y_2 + ··· + x_n ≤ ··· ≤ y_1 + y_2 + ··· + y_n,

where, by Lemma 1.51d, if x_j < y_j for at least one pair x_j, y_j, then the sign <
appears instead of ≤ in the appropriate place and in all subsequent places as well,
by the same lemma.


Thus inequalities “going in the same direction” can be added. In particular, x_1
≤ 0, ..., x_n ≤ 0 implies s = x_1 + ··· + x_n ≤ 0, where s < 0 if x_j < 0 for at least one
j. A similar result holds if ≤ is replaced by ≥ and < by >.


f. THEOREM. The following inequalities are equivalent: x < y, 0 < y − x, −y < −x,
x − y < 0.


Proof. This follows from Lemma 1.51d, in the same way that Theorem 1.51c
follows from Axiom c of Sec. 1.23. ∎


1.52. Definition. A real number x is said to be nonnegative if x ≥ 0, positive if x
> 0, nonpositive if x ≤ 0, and negative if x < 0.
Thus the number 0 is simultaneously nonnegative and nonpositive. 


1.53. Definition. Given two real numbers x and y, suppose x ≤ y, say. Then x is
called the minimum of the numbers x and y, and we write x = min{x, y}. By the
same token, y is called the maximum of the numbers x and y, and we write y =
max{x, y}. Using induction, we can define min{x_1, ..., x_n} and max{x_1, ..., x_n}
for any finite set of numbers x_1, ..., x_n, for example, by setting

max{x_1, ..., x_n} = max{max{x_1, ..., x_{n−1}}, x_n}.

The number

|x| = max{x, −x}

is called the absolute value or modulus of the number x. Thus |x| = x if x ≥ 0,
while |x| = −x if x ≤ 0. The number |x| is nonnegative for every x, and |−x| = |x|.


1.54. a. THEOREM. If a > 0, the inequality |x| ≤ a is equivalent to the inequalities

x ≤ a, −x ≤ a. (1)


Proof. Obviously x ≤ a if x ≤ 0, while −x ≤ a if x ≥ 0. Moreover, x = |x| ≤ a if x
≥ 0, while −x = |x| ≤ a if x ≤ 0. ∎


Noting that −x ≤ a is equivalent to −a ≤ x (by Theorem 1.51c), we can write
(1) in the form

−a ≤ x ≤ a.

b. THEOREM. The inequality

|x + y| ≤ |x| + |y| (2)

holds for arbitrary real numbers x and y.


Proof. If x and y are both nonnegative or both nonpositive, the inequality follows
at once from the definition of the absolute value. If, say, x ≥ 0, y ≤ 0, then

x + y ≤ x ≤ x + |y| = |x| + |y|, −x − y ≤ −y = |y| ≤ |x| + |y|,

so that

|x + y| = max{x + y, −x − y} ≤ |x| + |y|. ∎

c. Applying induction to (2), we get |x_1 + ··· + x_n| ≤ |x_1| + ··· + |x_n|.
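
The definition |x| = max{x, −x} and inequality (2), together with its extension to n terms, can be spot-checked on sample values. The sketch below uses exact rationals in place of general real numbers; the helper names `absolute`, `triangle_holds`, and `generalized_triangle_holds` are illustrative only.

```python
from fractions import Fraction
from itertools import product


def absolute(x):
    """Absolute value as defined in Sec. 1.53: |x| = max{x, -x}."""
    return max(x, -x)


def triangle_holds(values) -> bool:
    """Check |x + y| <= |x| + |y| for every pair drawn from values."""
    return all(
        absolute(x + y) <= absolute(x) + absolute(y)
        for x, y in product(values, repeat=2)
    )


def generalized_triangle_holds(xs) -> bool:
    """Check |x_1 + ... + x_n| <= |x_1| + ... + |x_n| for one list xs."""
    return absolute(sum(xs)) <= sum(absolute(x) for x in xs)


samples = [Fraction(k, 4) for k in range(-8, 9)]
print(triangle_holds(samples))
print(generalized_triangle_holds([Fraction(3), Fraction(-7, 2), Fraction(1, 5)]))
```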
1.55. Order and multiplication 

a. LEMMA. If x > 0, y > 0, then xy > 0.

Proof. Use Axiom d of Sec. 1.23, together with Sec. 1.47b. ∎

b. THEOREM. If x ≤ y, z > 0, then xz ≤ yz.

Proof. We need only note that

yz − xz = (y − x)z ≥ 0,

by Axiom d of Sec. 1.23. ∎


c. It follows from Sec. 1.47b that ≤ can be changed to < in both places in
Theorem 1.55b.

d. In particular, x^2 < x if 0 < x < 1, while x^2 > x if x > 1.


e. If 0 ≤ x ≤ y, 0 ≤ z ≤ u, then xz ≤ yz ≤ yu, so that inequalities can be multiplied
under these conditions.


f. In particular, if 0 ≤ x ≤ y, then x^2 ≤ y^2, ..., x^n ≤ y^n.


g. THEOREM. If x ≤ 0, y ≥ 0, then xy ≤ 0, while if x ≤ 0, y ≤ 0, then xy ≥ 0.


Proof. In the first case −x ≥ 0. Hence, by Axiom d of Sec. 1.23 and Theorem
1.49, (−x)y = (−1)xy = −(xy) ≥ 0, which implies xy ≤ 0. In the second case −y ≥
0, and hence −xy ≤ 0, xy ≥ 0, by the first result. ∎


Theorem 1.55g remains true if the signs ≤ and ≥ are replaced everywhere by
< and >.


h. In particular, x^2 = x · x > 0 for all x ≠ 0. It follows that 1 = 1 · 1 > 0 and then
from Lemma 1.51d that 2 = 1 + 1 > 1 + 0 = 1, 3 = 2 + 1 > 2, etc.


i. THEOREM. The formula |xy| = |x||y| holds for all x and y.


Proof. If x ≥ 0, y ≥ 0, then |x| = x, |y| = y and xy ≥ 0, by Axiom d of Sec. 1.23, so
that |xy| = xy = |x||y|. Similarly, if x ≤ 0, y ≤ 0, then |x| = −x, |y| = −y and xy ≥ 0,
by Theorem 1.55g, so that |xy| = xy = (−x)(−y) = |x||y|. If x ≥ 0, y ≤ 0, then |x| = x,
|y| = −y and xy ≤ 0, by Theorem 1.55g again, so that |xy| = −xy = x(−y) = |x||y|,
and similarly for the case x ≤ 0, y ≥ 0.

1.56. THEOREM. If x > 0, then 1/x > 0. Moreover 0 < x < y implies

\[ 0 < \frac{1}{y} < \frac{1}{x}. \]


Proof. The first assertion follows from

\[ x \cdot \frac{1}{x} = 1 > 0 \]

and Theorem 1.55g (in its stronger form). To prove the second assertion,
multiply 0 < x < y by 1/xy.


In particular, all rational numbers of the form p/q, where p and q are natural
numbers, are positive. 


1.57. The following principle is often used in proofs:

THEOREM. If a number z is nonnegative and less than every positive number, then
z = 0.


Proof. If z > 0, then, by hypothesis, z < z, which is impossible. ∎


1.6. Consequences of the Least Upper Bound Axiom 


1.61.† A set E ⊂ R is said to be bounded from below if there exists an element z
∈ R such that z ≤ x for every x ∈ E, a fact expressed concisely by writing z ≤ E.
Every number z with the above property relative to a set E is called a lower
bound of E. If E is bounded from above, i.e., if there exists a number z such that
E ≤ z, then −E (the set of all numbers −x with x ∈ E) is bounded from below,
since x ≤ z implies −z ≤ −x. In particular, −z is a lower bound of the set −E.
Conversely, if E is bounded from below by a number z, the same argument
shows that −E is bounded from above by the number −z.

Suppose the set E is bounded from below. Then a lower bound z_0 of E is
called the greatest lower bound of E if every other lower bound z of E is less
than or equal to z_0 (why is z_0 unique?). The greatest lower bound of E is denoted
by inf E (from the Latin “infimum”).


THEOREM. Every set E ⊂ R which is bounded from below has a greatest lower
bound, equal to −sup(−E).


Proof. The set −E is bounded from above and hence has a least upper bound ξ =
sup(−E), by the axiom of Sec. 1.24. If x ∈ E, then −x ≤ ξ and hence −ξ ≤ x, i.e.,
−ξ is a lower bound of the set E. Let η be any other lower bound of E. Then −η is
an upper bound of −E, and hence, by the definition of the least upper bound, −η
≥ sup(−E) = ξ or equivalently η ≤ −ξ. In other words, inf E exists and equals −ξ
= −sup(−E).
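
The identity inf E = −sup(−E) can be illustrated on finite sets, where the least upper bound is simply the largest element. This is only a sketch under that simplifying assumption (finite, nonempty sets of rationals); the helper names `sup_finite` and `inf_via_sup` are illustrative.

```python
from fractions import Fraction


def sup_finite(E):
    """Least upper bound of a finite nonempty set: its largest element."""
    return max(E)


def inf_via_sup(E):
    """Greatest lower bound computed as -sup(-E), as in the theorem above."""
    negated = {-x for x in E}   # the set -E
    return -sup_finite(negated)


E = {Fraction(3, 2), Fraction(-5, 4), Fraction(7), Fraction(0)}
print(inf_via_sup(E), min(E))
```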


1.62. a. THEOREM. If the sets E and F are bounded from above and E ⊂ F, then
sup E ≤ sup F, while if E and F are bounded from below and E ⊂ F, then inf F ≤
inf E.


Proof. In the first case, sup F is an upper bound for F and hence all the more so
for E ⊂ F, so that sup E ≤ sup F. In the second case, inf F is a lower bound for F
and hence all the more so for E ⊂ F, so that inf F ≤ inf E.


b. THEOREM. If x ≤ y for arbitrary x ∈ E, y ∈ F, then E is bounded from above, F is
bounded from below, and sup E ≤ inf F.


Proof. The set E is bounded from above by any y ∈ F. Hence sup E exists, and
sup E ≤ y for any y ∈ F. It follows that F is bounded from below by the number
sup E, and hence that sup E ≤ inf F. ∎


1.63. Next we prove the existence and uniqueness of the “nth root” of any 
positive number. 


THEOREM. Given any real x > 0 and integer n > 0, there exists a unique nth root of
x, i.e., a number y > 0 such that y^n = x.


Proof (after W. Rudin). Let A be the set of all positive z such that z^n ≤ x.
Then A is bounded from above (by 1 if x ≤ 1 and by x if x ≥ 1). Let

y = sup A. (1)

We now show that y^n = x, thereby proving the theorem.
Suppose y^n < x and let x − y^n = ε. Then, by the binomial theorem,

\[ (y+h)^n = y^n + n y^{n-1} h + \frac{n(n-1)}{2} y^{n-2} h^2 + \cdots
\le y^n + h \left[ n y^{n-1} + \frac{n(n-1)}{2} y^{n-2} + \cdots \right]
= y^n + h \left[ (1+y)^n - y^n \right] \]

for any positive h ≤ 1. Choosing

\[ h = \frac{\varepsilon}{(1+y)^n - y^n}, \]

we get (y + h)^n ≤ y^n + ε = x, which contradicts the definition (1). Therefore y^n ≥
x. Suppose y^n > x and let y^n − x = ε. Then

\[ (y-h)^n = y^n - n y^{n-1} h + \frac{n(n-1)}{2} y^{n-2} h^2 - \cdots
\ge y^n - h \left[ n y^{n-1} + \frac{n(n-1)}{2} y^{n-2} + \cdots \right]
= y^n - h \left[ (1+y)^n - y^n \right] \]

for any positive h ≤ 1. Again choosing

\[ h = \frac{\varepsilon}{(1+y)^n - y^n}, \]

we get (y − h)^n ≥ y^n − ε = x, which again contradicts (1). It follows that y^n = x, as
asserted. The uniqueness of the nth root follows from the inequality y_1^n < y_2^n
implied by y_1 < y_2 (cf. Sec. 1.55f). ∎


The nth root of x will henceforth be denoted by \(\sqrt[n]{x}\).
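
The proof characterizes the nth root as the least upper bound of the set A of positive z whose nth power does not exceed x. The sketch below is a numerical illustration of that characterization, not the method of the proof: it brackets sup A by bisection, keeping a lower end whose nth power is at most x and an upper end that is an upper bound of A. The function name `nth_root_bounds` and the parameter `steps` are assumptions made for this example, and exact rationals are used so that no rounding occurs.

```python
from fractions import Fraction


def nth_root_bounds(x: Fraction, n: int, steps: int = 40):
    """Return (lo, hi) with lo**n <= x <= hi**n, bracketing sup A,
    where A = {z > 0 : z**n <= x}."""
    if x <= 0 or n <= 0:
        raise ValueError("requires x > 0 and n > 0")
    lo = Fraction(0)            # invariant: lo**n <= x
    hi = max(Fraction(1), x)    # upper bound of A: 1 if x <= 1, x if x >= 1
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n <= x:
            lo = mid            # mid belongs to A
        else:
            hi = mid            # mid is an upper bound of A
    return lo, hi


lo, hi = nth_root_bounds(Fraction(2), 2)
print(float(lo), float(hi))
```

Each step halves the bracket, so after `steps` iterations its width is the initial upper bound divided by 2 to the power `steps`.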


1.64. THEOREM. The formula

\[ \sqrt[n]{xy} = \sqrt[n]{x} \, \sqrt[n]{y} \tag{2} \]

holds for arbitrary positive x and y.

Proof. Let ξ = \(\sqrt[n]{x}\), η = \(\sqrt[n]{y}\), τ = \(\sqrt[n]{xy}\). Since ξ^n = x, η^n = y, we have
(ξη)^n = ξ^n η^n = xy = τ^n. But then

τ = \(\sqrt[n]{xy}\) = ξη,

by the uniqueness of the nth root. ∎


Similarly, it can be shown that

\[ \sqrt[m]{\sqrt[n]{x}} = \sqrt[mn]{x} \tag{3} \]

for arbitrary x > 0 and integers m, n > 1.

1.65. Suppose n is an even integer. Then (−x)^n = (−1)^n x^n = x^n > 0 for all x ≠ 0, so
that the equation y^n = x > 0 has both a positive solution y_1 = \(\sqrt[n]{x}\) and a negative
solution y_2 = −\(\sqrt[n]{x}\), while the equation y^n = x < 0 has no real solutions at all. On
the other hand, if n is an odd integer, the equation y^n = x > 0 has a unique real
solution y = \(\sqrt[n]{x}\) and the equation y^n = x < 0 also has a unique solution, namely y
= −\(\sqrt[n]{|x|}\).

Formulas (2) and (3) guarantee the validity for the real numbers of all the 
usual elementary algebraic results involving radicals (like the formulas for the 
solution of quadratic and cubic equations). 


1.66. A set E ⊂ R is said to be bounded from both sides, or simply bounded, if it
is bounded both from above and from below. Every bounded set E has both a
least upper bound sup E and a greatest lower bound inf E. The following are
particularly important examples of bounded sets:


1.67. Intervals. The set of all real numbers x satisfying the inequality a ≤ x ≤ b
is denoted by [a, b] and called a closed interval, with left-hand end point a and
right-hand end point b (it is assumed that a < b). The set of all real numbers x
satisfying the inequality a < x < b is denoted by (a, b) and called an open
interval, again with left-hand end point a and right-hand end point b. Thus the
end points of a closed interval belong to the interval itself, while the end points
of an open interval do not. In any event,

sup [a, b] = sup (a, b) = b, inf [a, b] = inf (a, b) = a.


It is also convenient to introduce “half-closed” and “half-open” intervals.
Thus the sets {x: a < x ≤ b} = (a, b] and {x: a ≤ x < b} = [a, b) are both called
intervals, the former half-open on the left and half-closed on the right, the latter
half-closed on the left and half-open on the right. For completeness, we will
sometimes regard a single point a as a closed interval, writing {a} = [a, a] = {x:
a ≤ x ≤ a}.


1.7. The Principle of Archimedes and Its Consequences 1.71. 
THEOREM (Principle of Archimedes) + Given arbitrary real numbers 
x7 Oand y, there exists an integer n such that (n—1)x Sy < nx. 


Proof. Suppose $px \le y$ for every integer p. Then the set A of all numbers px is
bounded from above, with y as an upper bound. It follows from the axiom of
Sec. 1.24 that A has a least upper bound $\xi = \sup A$. Since the number $\xi - x < \xi$ is
no longer an upper bound of A, there exists an integer p such that $px > \xi - x$. But
then $(p + 1)x > \xi$, so that $\xi$ cannot be an upper bound of A. This contradiction
proves the existence of an integer p such that $px > y$. An analogous argument
involving lower bounds shows that there exists an integer q such that $qx < y$,
where clearly $q < p$. Examining all the pairs (q, q + 1), (q + 1, q + 2), ..., (p − 1,
p), we find one among them, say (n − 1, n), such that $(n - 1)x \le y$ while $y < nx$.


In particular, choosing x = 1, we find that there exists an integer n such that
$n - 1 \le y < n$ for any given $y \in R$. The number n − 1 is called the integral part of y
and is denoted by [y], while the number y − [y] is called the fractional part of y
and is denoted by (y). Thus every number y is the sum

$$y = [y] + (y)$$

of its integral and fractional parts.
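For rational x and y the integer n of Theorem 1.71 can be computed directly. The following Python sketch is only an illustration under that assumption: exact rational arithmetic (`fractions.Fraction`) stands in for arbitrary real numbers, and the function names `archimedes_n` and `integral_and_fractional_part` are hypothetical.

```python
from fractions import Fraction
import math


def archimedes_n(x, y):
    """Return the integer n with (n - 1) * x <= y < n * x, assuming x > 0."""
    x, y = Fraction(x), Fraction(y)
    if x <= 0:
        raise ValueError("x must be positive")
    # floor(y / x) is the largest integer k with k * x <= y, so n = k + 1.
    return math.floor(y / x) + 1


def integral_and_fractional_part(y):
    """Return ([y], (y)) as defined in the text, i.e. the case x = 1."""
    y = Fraction(y)
    integral = archimedes_n(1, y) - 1
    return integral, y - integral
```

For example, `archimedes_n(Fraction(3, 2), 5)` gives 4, since 3 · (3/2) ≤ 5 < 4 · (3/2). Note that for negative y the fractional part is still nonnegative: the integral part of −7/2 is −4 and its fractional part is 1/2.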


1.72. THEOREM. Given arbitrary real numbers $x > 1$, $y > 0$, there exists an integer
n such that $x^{n-1} \le y < x^n$.

Proof. This is the “multiplicative version” of the principle of Archimedes, and is
proved by going over from integral multiples‡ of x to integral powers of x (see
Sec. 1.46) in the proof of Theorem 1.71 (give the details). ∎
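The exponent n of Theorem 1.72 can be found by stepping through the integral powers of x. The sketch below assumes rational x > 1 and y > 0 and uses exact arithmetic; the name `power_bracket` is hypothetical, and the search is a direct illustration rather than the proof the text asks for.

```python
from fractions import Fraction


def power_bracket(x, y):
    """Return the integer n with x**(n - 1) <= y < x**n, for x > 1 and y > 0."""
    x, y = Fraction(x), Fraction(y)
    if x <= 1 or y <= 0:
        raise ValueError("need x > 1 and y > 0")
    n = 0
    power = Fraction(1)          # invariant: power == x**n
    while power <= y:            # move up until x**n > y
        power *= x
        n += 1
    while power / x > y:         # move down until x**(n - 1) <= y
        power /= x
        n -= 1
    return n
```

With x = 10 and y = 250 the function returns 3, corresponding to 10² ≤ 250 < 10³.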


1.73. THEOREM. Given arbitrary real numbers $x > 0$ and $y > 0$, there exists an
integer $n > 0$ such that

$$\frac{y}{n} < x. \tag{1}$$

Proof. As in Theorem 1.71, $y < nx$, where now y and

$$n > \frac{y}{x} \tag{2}$$

are positive. Multiplying both sides of (2) by x/n, we get (1). ∎

In particular, it follows that

$$\inf \left\{ \frac{y}{n} : n = 1, 2, \ldots \right\} = 0$$

for any $y > 0$. In fact, the set in curly brackets consists of positive numbers only
and hence has a nonnegative greatest lower bound. But, as just shown, this
greatest lower bound cannot be positive, and hence must equal zero.


1.74. COROLLARY. Each of the following systems of half-open intervals has an
empty intersection:

$$(0, y] \supset \left(0, \frac{y}{2}\right] \supset \cdots \supset \left(0, \frac{y}{n}\right] \supset \cdots, \tag{3}$$

$$(a, a + y] \supset \left(a, a + \frac{y}{2}\right] \supset \cdots \supset \left(a, a + \frac{y}{n}\right] \supset \cdots, \tag{4}$$

$$[a - y, a) \supset \left[a - \frac{y}{2}, a\right) \supset \cdots \supset \left[a - \frac{y}{n}, a\right) \supset \cdots \tag{5}$$

Proof. If the intervals of the system (4) had a common point $\xi$, then $\xi - a$ would
be a common point of the system (3), while if the intervals of the system (5) had
a common point $\eta$, then $a - \eta$ would be a common point of the system (3). But
the intervals of the system (3) cannot have any common points at all, because of
Theorem 1.73.

The corollary clearly remains true if $y/n$ (n = 1, 2, ...) is replaced by $y/(nx)$ ($x > 0$
arbitrary) or by $y/x^{n-1}$ ($x > 1$ arbitrary), in particular, by $y/10^{n-1}$.


1.75. THEOREM. Every open interval (a, b) contains a rational point.

Proof. Let $h = b - a > 0$, and let n be an integer greater than 1/h (the existence of
n follows from Theorem 1.71), so that $1/n < h$. By Theorem 1.71 again, there
exists an integer m such that

$$\frac{m}{n} \le a < \frac{m + 1}{n},$$

where clearly

$$\frac{m + 1}{n} - a \le \frac{m + 1}{n} - \frac{m}{n} = \frac{1}{n} < b - a$$

and hence

$$\frac{m + 1}{n} < b.$$

It follows that

$$a < \frac{m + 1}{n} < b,$$

i.e., the rational number

$$\frac{m + 1}{n}$$

belongs to the interval (a, b). ∎

There are actually an infinite number of rational points in (a, b). In fact,
applying the above theorem to the interval

$$\left(\frac{m + 1}{n}, b\right)$$

gives a new rational number p/q such that

$$\frac{m + 1}{n} < \frac{p}{q} < b,$$

and this process can clearly be continued indefinitely.
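The proof of Theorem 1.75 is constructive, and its two steps (choose n > 1/h, then choose m with m/n ≤ a < (m + 1)/n) can be followed literally. The sketch below assumes the end points are given as exact rationals, which is only a stand-in for arbitrary real numbers; the name `rational_between` is hypothetical.

```python
from fractions import Fraction
import math


def rational_between(a, b):
    """Return the rational (m + 1)/n constructed in the proof, with a < (m + 1)/n < b."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    h = b - a
    n = math.floor(1 / h) + 1     # an integer n > 1/h, so that 1/n < h
    m = math.floor(a * n)         # the integer m with m/n <= a < (m + 1)/n
    return Fraction(m + 1, n)
```

Applying the function again to the interval from the returned point to b yields a second rational point, and so on, as in the remark following the proof.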


1.76. THEOREM. Given any real number $\xi$, let $N_\xi$ be the set of all rational numbers
$s \le \xi$ and let $P_\xi$ be the set of all rational numbers $r \ge \xi$. Then
$\sup N_\xi = \xi = \inf P_\xi$.†

Proof. Let $\alpha = \sup N_\xi$. Then, since $s \le \xi$ for every $s \in N_\xi$, we have $\alpha \le \xi$ by the
very definition of the least upper bound. Suppose $\alpha < \xi$. By Theorem 1.75, there
is a rational point p in $(\alpha, \xi)$. Since $p < \xi$, we have $p \in N_\xi$, which implies
$p \le \sup N_\xi = \alpha$, contrary to the condition $p \in (\alpha, \xi)$. Therefore the inequality $\alpha < \xi$ is
impossible, and hence $\alpha = \sup N_\xi = \xi$. The fact that $\inf P_\xi = \xi$ is proved similarly. ∎


1.77. Decimal representation of real numbers. Next we show how an arbitrary
real number $\xi$ can be represented by a suitable sequence of the symbols
(“digits”) 0, 1, 2, ..., 9.

a. Suppose $\xi > 0$. By Theorem 1.72, there exists a (unique) integer p such that

$$10^p \le \xi < 10^{p+1}.$$

Having found the “exponent” p, we next find a number $\theta_0$ (from the set 1, 2, ...,
9) such that

$$\theta_0 \cdot 10^p \le \xi < (\theta_0 + 1) \cdot 10^p$$

(where 9 + 1 = 10). The number $\theta_0$ is also uniquely determined, since the
intervals

$$\theta \cdot 10^p \le x < (\theta + 1) \cdot 10^p \qquad (\theta = 0, 1, \ldots, 9)$$

are nonintersecting. Next, having found $\theta_0$, we find a number $\theta_1$ (from the set
0, 1, ..., 9) such that

$$\theta_0 \cdot 10^p + \theta_1 \cdot 10^{p-1} \le \xi < \theta_0 \cdot 10^p + (\theta_1 + 1) \cdot 10^{p-1}.$$

Continuing this process indefinitely, we get a sequence of symbols (digits from 0
to 9)

$$\theta_0 \theta_1 \theta_2 \ldots \qquad (\theta_0 \ne 0). \tag{6}$$

The underlying presence of the number p is indicated as follows: If $p \ge 0$, we put
a decimal point between the symbols $\theta_p$ and $\theta_{p+1}$; while if $p < 0$, i.e., if $p = -q$,
$q > 0$, we write q additional zeros in front of the sequence (6) and then put a
decimal point after the first zero. In this way, the influence of the number p is
reflected in the expression (6).

Thus to every real number $\xi > 0$ there corresponds, in accordance with this
rule, an expression of the form (6) (possibly preceded by a “string of zeros”),
with a decimal point in some position. The expression (6) is called the decimal
representation or expansion of $\xi$, with (decimal) digits $\theta_0, \theta_1, \theta_2, \ldots$ (in that
order). The decimal representation of the number 1 is just 1.000 ..., and similarly
for the numbers 2, 3, ..., 9. Similarly, the decimal representation of the number
10 is 10.000 ..., while for numbers of the form $s/10^t$ (s, t = 0, 1, 2, ...)
(“rational decimals”), and only for such numbers, there are no more than t
nonzero digits after the decimal point.
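The digit-by-digit construction of Sec. 1.77a can be carried out mechanically for a positive rational number. The following sketch assumes exact rational input; the names `decimal_digits` and `place_point` are hypothetical. The first function returns the exponent p together with the leading digits of (6), and the second places the decimal point according to the rule just described.

```python
from fractions import Fraction


def decimal_digits(xi, count):
    """Return (p, digits): the exponent p and the first `count` digits of xi > 0."""
    xi = Fraction(xi)
    if xi <= 0:
        raise ValueError("xi must be positive")
    # Find the exponent p with 10**p <= xi < 10**(p + 1).
    p = 0
    while Fraction(10) ** (p + 1) <= xi:
        p += 1
    while Fraction(10) ** p > xi:
        p -= 1
    digits = []
    remainder = xi
    for k in range(count):
        unit = Fraction(10) ** (p - k)
        theta = remainder // unit      # largest digit with theta * unit <= remainder
        digits.append(theta)
        remainder -= theta * unit
    return p, digits


def place_point(p, digits):
    """Write the digits with the decimal point positioned as the text prescribes."""
    text = "".join(str(d) for d in digits)
    if p >= 0:
        if len(digits) <= p:
            raise ValueError("need more than p digits to place the point")
        return text[:p + 1] + "." + text[p + 1:]
    q = -p
    # q zeros in front of the sequence, with the point after the first zero.
    return "0." + "0" * (q - 1) + text
```

For instance, `decimal_digits(Fraction(1, 80), 4)` gives the exponent −2 and the digits 1, 2, 5, 0, which `place_point` writes as 0.01250.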


b. THEOREM. It is impossible for all the digits in an expression of the form (6) to be
nines starting from some position, i.e., (6) cannot have an “infinite run of nines.”

Proof. The presence of an infinite run of nines beginning with some number n
after the decimal point (the “nth decimal place”) would mean that the number $\xi$
lies in all the intervals

$$r + \frac{9}{10^n} \le \xi < r + \frac{10}{10^n} = r + \frac{1}{10^{n-1}},$$

$$r + \frac{9}{10^n} + \frac{9}{10^{n+1}} \le \xi < r + \frac{9}{10^n} + \frac{10}{10^{n+1}} = r + \frac{1}{10^{n-1}}, \tag{7}$$

$$r + \frac{9}{10^n} + \frac{9}{10^{n+1}} + \frac{9}{10^{n+2}} \le \xi < r + \frac{1}{10^{n-1}}, \quad \ldots,$$

where r is the number represented by the digits preceding the nth decimal place.
But this is impossible, since the system of intervals (7) has an empty
intersection, by the remark following Corollary 1.74. ∎


c. THEOREM. Let

$$\tau_1 \tau_2 \ldots \tau_n \ldots \tag{8}$$

be an arbitrary sequence of digits from 0 to 9, with a decimal point in some
position, where not all the $\tau_j$ are zero and there are digits other than 9
arbitrarily far from the decimal point. Then there exists a real number $\xi > 0$ with
(8) as its decimal expansion.

Proof. Let $\tau_m$ be the first nonzero digit in (8). The decimal point is either to the
right of $\tau_m$ by $q \ge 0$ digits (not including $\tau_m$), or to the left by $t \ge 1$ digits
(including $\tau_m$); in the second case we set $q = -t$. We now show that the decimal
expansion of the number $\xi$ defined as

$$\xi = \sup \{10^q \tau_m + 10^{q-1} \tau_{m+1} + \cdots + 10^{q-k} \tau_{m+k} : k = 0, 1, 2, \ldots\}$$

coincides with (8). Let s be a fixed positive integer, choose $r > s$ such that
$\tau_{m+r} \le 8$, and let $k > r$. Then, summing a geometric progression, we get

$$10^{q-s-1} \tau_{m+s+1} + \cdots + 10^{q-r} \tau_{m+r} + \cdots + 10^{q-k} \tau_{m+k}$$
$$\le 9 \cdot 10^{q-s-1} + \cdots + 9 \cdot 10^{q-r} + \cdots + 9 \cdot 10^{q-k} - 10^{q-r}$$
$$= 9 \cdot \frac{10^{q-(s+1)} - 10^{q-(k+1)}}{1 - 10^{-1}} - 10^{q-r} < 10^{q-s} - 10^{q-r}.$$

It follows that

$$10^q \tau_m + \cdots + 10^{q-s} \tau_{m+s} \le 10^q \tau_m + \cdots + 10^{q-s} \tau_{m+s} + \cdots + 10^{q-k} \tau_{m+k}$$
$$< 10^q \tau_m + \cdots + 10^{q-s} \tau_{m+s} + 10^{q-s} - 10^{q-r}$$
$$< 10^q \tau_m + \cdots + 10^{q-s} (\tau_{m+s} + 1),$$

and hence

$$10^q \tau_m + \cdots + 10^{q-s} \tau_{m+s} \le \xi < 10^q \tau_m + \cdots + 10^{q-s} (\tau_{m+s} + 1)$$

for every s = 0, 1, 2, ... Setting s = 0, 1, 2, ... and recalling the definition of the
number p and the digits $\theta_0, \theta_1, \ldots$ of the number $\xi$, we find that $p = q$,
$\theta_0 = \tau_m$, $\theta_1 = \tau_{m+1}$, ..., i.e., the decimal expansion of $\xi$ is just (8). ∎

d. If $\xi < 0$, then $-\xi > 0$, and hence

$$-\xi = \tau_1 \tau_2 \ldots \tau_n \ldots,$$

as shown in Sec. 1.77c. We now set

$$\xi = -\tau_1 \tau_2 \ldots \tau_n \ldots$$

by definition. Finally, if $\xi = 0$, we set

$$0 = 0.0000\ldots$$


1.78. Suppose we repeat the considerations of Sec. 1.77, replacing the number
10 figuring in the powers $10^p$, $10^{p+1}$, ... by some other integer $P > 1$. This gives
a so-called “P-ary system” for representing the real numbers. The most
commonly encountered P-ary systems are the binary and ternary systems, where
P = 2 and P = 3, respectively. In the binary system the representation of an
arbitrary real number involves only the digits 0 and 1, while in the ternary
system it involves only the digits 0, 1, and 2.
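The passage from base 10 to an arbitrary base P changes nothing in the construction except the powers involved. As a hedged illustration, the sketch below computes the leading P-ary digits of a positive rational number; the name `p_ary_digits` and the restriction to exact rational input are assumptions made for the example.

```python
from fractions import Fraction


def p_ary_digits(xi, base, count):
    """Return (p, digits) with base**p <= xi < base**(p + 1) and the first `count` digits."""
    xi = Fraction(xi)
    if xi <= 0 or base < 2:
        raise ValueError("need xi > 0 and an integer base P > 1")
    p = 0
    while Fraction(base) ** (p + 1) <= xi:
        p += 1
    while Fraction(base) ** p > xi:
        p -= 1
    digits = []
    for k in range(count):
        unit = Fraction(base) ** (p - k)
        theta = xi // unit             # a digit from 0 to base - 1
        digits.append(theta)
        xi -= theta * unit
    return p, digits
```

In the binary system, for example, 5/8 has exponent −1 and digits 1, 0, 1, 0, ..., i.e., it is written 0.101.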


1.8. The Principle of Nested Intervals 


1.81. A set Q of intervals on the real line R with the property that given any two
intervals $I, J \in Q$ either $I \subset J$ or $J \subset I$, is called a system of nested
intervals. According to Corollary 1.74, a system of nested half-open intervals
may well have an empty intersection, and the same is all the more true of a
system of nested open intervals. However, a system of nested closed intervals
always has a nonempty intersection, as shown by the following proposition due
to Cantor:

THEOREM (Principle of nested intervals). Let Q be any system of
nested closed intervals [a, b]. Then there exists at least one point of R belonging
to every interval of Q. More exactly, the numbers

$$\xi = \sup \{a: [a, b] \in Q\}, \qquad \eta = \inf \{b: [a, b] \in Q\} \tag{1}$$

exist ($\xi \le \eta$) and the interval $[\xi, \eta]$ is the intersection of all the intervals of Q.†

Proof. Let $E = \{a: [a, b] \in Q\}$ be the set of all left-hand end points of the
intervals of Q, and let $F = \{b: [a, b] \in Q\}$ be the set of all right-hand end points
of the intervals of Q. Given any two intervals $[a_1, b_1], [a_2, b_2] \in Q$, one must be
contained in the other, say $[a_1, b_1] \subset [a_2, b_2]$, so that $a_2 \le a_1 \le b_1 \le b_2$. Hence no
$a \in E$ exceeds any $b \in F$. It follows from Theorem 1.62b that the numbers (1)
exist and satisfy the inequality $\xi \le \eta$. For any $[a, b] \in Q$ we have $a \le \xi \le \eta \le b$,
so that $[a, b] \supset [\xi, \eta]$ and hence $\prod [a, b] \supset [\xi, \eta]$, where $\prod [a, b]$ denotes the
intersection of all the intervals of Q. Moreover, $\prod [a, b]$ consists only
of points of the interval $[\xi, \eta]$. In fact, for any point $x \notin [\xi, \eta]$, e.g., for any $x < \xi$,
there is a left-hand end point a such that $x < a \le \xi = \sup \{a\}$, and then x does not
belong to the corresponding interval [a, b]. ∎


1.82. The following theorem gives conditions under which the intersection of all
the intervals of Q reduces to a single point:

THEOREM. The intersection of all the
intervals of a system Q of nested closed intervals consists of a single point if and
only if given any $\varepsilon > 0$, there is an interval $[a, b] \in Q$ of length $b - a < \varepsilon$.

Proof. By Theorem 1.81, the intersection of all the intervals of Q is an interval
$[\xi, \eta]$, which reduces to a single point if and only if $\xi = \eta$. If $\xi \ne \eta$, then the length
of every interval $[a, b] \supset [\xi, \eta]$ of the system Q is no less than $\eta - \xi$, and hence if
Q contains intervals of arbitrarily small length, the intersection of all these
intervals must reduce to a single point. Conversely, because of (1), given any
$\varepsilon > 0$, there is an interval $[a_1, b_1] \in Q$ such that

$$a_1 > \xi - \frac{\varepsilon}{2}$$

and an interval $[a_2, b_2] \in Q$ such that

$$b_2 < \eta + \frac{\varepsilon}{2}.$$

If $[a_2, b_2] \subset [a_1, b_1]$, say, then

$$a_2 \ge a_1 > \xi - \frac{\varepsilon}{2}, \qquad b_2 < \eta + \frac{\varepsilon}{2}.$$

Hence, if $\xi = \eta$, then $b_2 - a_2 < \varepsilon$, so that Q contains an interval of length less
than $\varepsilon$. ∎
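A concrete system of nested closed intervals whose lengths become arbitrarily small is obtained by repeated bisection. The sketch below is an illustration chosen for this purpose and not taken from the text: starting from [1, 2], it halves the interval while keeping $a^2 \le 2 \le b^2$, so that the single common point guaranteed by Theorem 1.82 is $\sqrt{2}$. The function name is hypothetical.

```python
from fractions import Fraction


def nested_intervals_for_sqrt2(steps):
    """Return nested closed intervals [a, b] with a*a <= 2 <= b*b, halved at each step."""
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid <= 2:
            a = mid                  # keep the right half
        else:
            b = mid                  # keep the left half
        intervals.append((a, b))
    return intervals
```

After k steps the interval has length $2^{-k}$, so for any given $\varepsilon > 0$ some interval of the system has length less than $\varepsilon$.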


1.9. The Extended Real Number System 


1.91. Definition. By the extended real number system $\overline{R}$ is meant the set
consisting of the elements of the real number system R and two symbols or
“points” $-\infty$ and $\infty$ (more exactly, $-\infty$ and $+\infty$), called “minus infinity” and “plus
infinity.”† The order relations are extended to these symbols by the following
rules: (a) $-\infty < x$ for every $x \in R$; (b) $x < \infty$ for every $x \in R$; (c) $-\infty < \infty$.

The axioms of order (Sec. 1.23) continue to hold in the extended system $\overline{R}$.
The ordinary real numbers (elements of R) are said to be finite, as opposed to the
symbols $-\infty$ and $\infty$.

1.92. Given any nonempty set $E \subset \overline{R}$, the expressions sup E and inf E are
defined as follows: If E does not contain the point $\infty$ and is bounded from above,
then sup E has the same meaning as in Sec. 1.24; otherwise‡ we set $\sup E = \infty$.
Similarly, if E does not contain the point $-\infty$ and is bounded from below, then
inf E has the same meaning as in Sec. 1.61; otherwise we set $\inf E = -\infty$. Thus
every nonempty subset of the extended real number system $\overline{R}$ has a least upper
bound and a greatest lower bound.

1.93. If a and b ($a < b$) are any two points of $\overline{R}$, then, just as in the case of R (see
Sec. 1.67), the set

$$[a, b] = \{x \in \overline{R}: a \le x \le b\}$$

is called a closed interval with end points a and b, while the set

$$(a, b) = \{x \in \overline{R}: a < x < b\}$$

is called an open interval with end points a and b.

1.94. The principle of nested intervals (Theorem 1.81) is readily generalized to
the case of the extended real line $\overline{R}$. Let Q be any system of closed intervals on $\overline{R}$
which are nested in the same sense as in Sec. 1.81. Then there exists at least one
point of $\overline{R}$ belonging to every interval of Q, i.e., the intersection of all the
intervals of Q is nonempty. In fact, this intersection is just the interval $[\xi, \eta]$,
where $\xi = \sup \{a: [a, b] \in Q\}$, $\eta = \inf \{b: [a, b] \in Q\}$ ($\xi \le \eta$), as in Sec. 1.81. The
proof is virtually the same as for the ordinary principle of nested intervals (give
the details).


Problems 


1. Let addition and multiplication of the elements of the set {0, 1} be defined by
the rules

$$0 + 0 = 0, \quad 0 + 1 = 1, \quad 1 + 0 = 1, \quad 1 + 1 = 0,$$
$$0 \cdot 0 = 0, \quad 0 \cdot 1 = 0, \quad 1 \cdot 0 = 0, \quad 1 \cdot 1 = 1.$$

Verify that {0, 1} is a field.

2. Prove that (a) $|x - y| \ge ||x| - |y||$ for all real x, y; (b)
$|x - y_1 - \cdots - y_n| \ge ||x| - |y_1| - \cdots - |y_n||$ for all real $x, y_1, \ldots, y_n$.

3. Prove that the sum (or product) of a rational number and an irrational number
is irrational. Can the sum (or product) of two irrational numbers be rational?

4. Prove that $\sqrt{3}$ is irrational.

5. Which is larger, $\sqrt{3} + \sqrt{5}$ or $\sqrt{2} + \sqrt{6}$?

6. Prove that

$$a + \frac{1}{a} \ge 2$$

if a is a positive real number.

7. Let $x_1, x_2, \ldots, x_n$ be positive real numbers. Prove that if $x_1 x_2 \cdots x_n = 1$, then

$$x_1 + x_2 + \cdots + x_n \ge n,$$

where equality occurs if and only if $x_1 = x_2 = \cdots = x_n = 1$.

8. The geometric mean of the positive real numbers $x_1, x_2, \ldots, x_n$ is defined by

$$g = \sqrt[n]{x_1 x_2 \cdots x_n}$$

and the arithmetic mean by

$$a = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$

Prove that $g < a$ unless $x_1 = x_2 = \cdots = x_n$, in which case $g = a$.

9. Prove that

$$n! = 1 \cdot 2 \cdots n < \left(\frac{n + 1}{2}\right)^n \qquad (n \ge 2).$$

10. Solve the following equations: (a) $|x - 1| = 2$; (b) $|x - 1| = |3 - x|$; (c) $|x + 1| =
|3 + x|$; (d) $|2x| = |x - 2|$.


11. Find the set of all real x such that 2—# - l 


2x+3 3 
12. Find the set of all real x such that $|x + 1| + |x - 1| > 2$ and $|x - 2| \le 6$.

13. Let $A = \{a, a^2, a^3, \ldots\}$, where $0 \le a \le 1$. Find max A, min A, sup A, and inf A.
What happens if $a > 1$? Discuss the case of negative a.

14. Give an example where $[|x|] \ne |[x]|$.
15. Prove that =4 33°" poo <7: 


16. By a “repeating decimal” is meant a decimal like 1/3 = 0.333333..., where a
single digit (possibly 0) “repeats itself forever” or a decimal like 1/7 =
0.142857142857..., where a whole block of digits repeats itself forever. Prove
that every repeating decimal represents a rational number.

17. What are the rational numbers represented by the following repeating
decimals: (a) 0.919999...; (b) 0.417417...; (c) 2.331331...; (d) −8.191919...?

18. By the arithmetic sum A + B of two numerical sets A and B is meant the set of
all sums x + y where $x \in A$, $y \in B$. Prove that if A and B are bounded from above,
then A + B is bounded from above and sup (A + B) = sup A + sup B.

19. By the arithmetic product AB of two numerical sets A and B is meant the set
of all products xy where $x \in A$, $y \in B$. Prove that if A and B are bounded from
above and consist of positive numbers only, then AB is bounded from above and
sup AB = sup A · sup B.

20. By the arithmetic nth power $A^n$ of a numerical set A is meant the set of all
numbers $x^n$ where $x \in A$. (Note that in general $A^2 \ne AA$.) Prove that if A is
bounded from above and consists of positive numbers only, then $A^n$ is bounded
from above and $\sup A^n = (\sup A)^n$.

21. Prove that if the set of all real numbers is partitioned into two nonempty
nonintersecting subsets A and B such that $a < b$ for all $a \in A$, $b \in B$ (a “Dedekind
cut”), then there exists one and only one number y such that $a \le y \le b$ for all
$a \in A$, $b \in B$.


† “Some people, oh king Hieron, think that the number of grains of sand in the world is infinitely large.
However, I can give you a convincing proof of my ability to name certain numbers larger than the number
of grains of sand in a pile as large as the Earth itself.” (Archimedes)

† It is stretching things quite a bit to
say that the least upper bound axiom (Sec. 1.24) is known from “everyday experience.” But the same is true
of Euclid’s axiom on the existence of a unique line through a given point parallel to a given line (a key
axiom of ordinary plane geometry). Experience does not present us with uniquely determined sets of
mathematical axioms. Rather, between experience and any scientific theory there is an intervening stage of
formulating appropriate axioms, which may vary greatly within the context of one and the same
experimental data. Thus besides Euclidean geometry we have non-Euclidean (Lobachevskian) geometry in
which there are many lines through a given point parallel to a given line. By the same token, there are other
theories of the real numbers besides the one given here, in which a set bounded from above can fail to have
a least upper bound. See V. A. Uspensky, Lectures on Computable Functions (in Russian), Moscow (1960),
Sec. 12.

‡ Thus the expression x + y + z has a unique meaning.

† Thus the expression xyz has a unique meaning.

† The symbol ∎ means Q.E.D. and indicates the end of a proof.

† Theorem 1.33 refers to the (unique) theorem in Sec. 1.33, Lemma 1.51a to the lemma in Sec. 1.51a, etc.

† A proof is necessary, since each side of (3) is defined independently.

‡ Note the sense in which Lemmas 1.51b and 1.51d strengthen Axioms b and c of Sec. 1.23.

† The definitions in this section are obvious analogues of those given in Sec. 1.24.

† There are other modern axiomatic treatments of the real numbers in which the principle of Archimedes
and the principle of nested intervals (see Sec. 1.8) appear as axioms, with the least upper bound axiom of
Sec. 1.24 then becoming a theorem.

‡ The numbers $(n - 1)x$ and $nx$ are called integral multiples of x, for an obvious reason.

† Note that $N_\xi$ is bounded from above (by $\xi$), while $P_\xi$ is bounded from below (by $\xi$).

† If $\xi = \eta$, then by the interval $[\xi, \eta]$ we mean just the point $\xi = \eta$.

† The set $\overline{R}$ is often called the extended real line.

‡ That is, if E contains $\infty$, or if E fails to contain $\infty$ but is not bounded from above.


2 Sets 


2.1. Operations on Sets 


2.11. The operations on sets introduced in Sec. 1.12 will now be examined in 
more detail. It will be recalled that the union of given sets A, B, C, ... is defined 
as the set of all elements belonging to at least one of the sets A, B, C, ..., while 
the intersection of the given sets is defined as the set of all elements belonging to 
every one of the sets A, B, C, ... 

The union S and intersection D of the sets A, B, C, ... are sometimes called the
sum and product, respectively,† and correspondingly we write

$$S = A + B + C + \cdots, \qquad D = ABC \cdots$$


There is some justification for this “arithmetic language,” as shown by the 
following 


THEOREM. The formula

$$(A + B)C = AC + BC \tag{1}$$

holds for any three sets A, B, and C.

Proof. Recalling that equality of two sets means that every element of one set is
also an element of the other set, we see that to prove (1) it must be shown that
every element x belonging to the left-hand side (A + B)C also belongs to the
right-hand side AC + BC, and conversely that every element y belonging to the
right-hand side AC + BC also belongs to the left-hand side (A + B)C.

Thus suppose x belongs to (A + B)C. Then, being an element of the
intersection of A + B and C, x must belong to both A + B and C, so that
$x \in A + B$, $x \in C$. Since x belongs to the union of A and B, x must belong to at least one of
the sets A and B, say to A. But $x \in A$, $x \in C$ together imply $x \in AC$ and hence
$x \in AC + BC$. If, on the other hand, x belongs to B rather than to A, we deduce in the
same way that $x \in BC$ and hence $x \in AC + BC$.

Conversely, suppose y belongs to the sum AC + BC. Then y belongs to (at
least) one of the sets AC and BC, say to BC, so that $y \in B$, $y \in C$. But $y \in B$
implies $y \in A + B$, and hence finally $y \in (A + B)C$. The case $y \in AC$ is analyzed
similarly. ∎

It must not be thought that all the rules of arithmetic carry over to operations
on sets. For example, given arbitrary sets A, B, and C, we have the formulas†

$$A + A = A,$$
$$AA = A,$$
$$A + BC = (A + B)(A + C),$$

none of which makes any sense as a piece of ordinary arithmetic.
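The formulas of this section are easy to test on finite sets. In the sketch below, Python's `|` and `&` play the roles of the sum (union) and product (intersection) of sets; the function name `check_set_formulas` is hypothetical, and a check on particular sets illustrates the formulas without proving them.

```python
def check_set_formulas(A, B, C):
    """Evaluate the set formulas of Sec. 2.11 on three finite sets."""
    A, B, C = set(A), set(B), set(C)
    return {
        "(A+B)C = AC+BC": (A | B) & C == (A & C) | (B & C),
        "A+A = A": A | A == A,
        "AA = A": A & A == A,
        "A+BC = (A+B)(A+C)": A | (B & C) == (A | B) & (A | C),
    }
```

For example, `check_set_formulas({1, 2, 3}, {3, 4}, {2, 4, 5})` reports every formula as true.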


2.12. Next we introduce a new operation, known as complementation. Let B be a
subset of the set A. Then the set of all elements of A not belonging to B, denoted
by CB or A − B, is called the complement of B (relative to A). Obviously, we
have

$$C(CB) = A - (A - B) = B,$$
$$(A - B) + B = A.$$

Note, however, that the formula

$$(A + B) - B = A$$

is in general false, and in fact holds only if A and B have no common elements.

THEOREM. The formula

$$C \sum_\nu B_\nu = \prod_\nu C B_\nu \tag{2}$$

holds for arbitrary sets $B_\nu$,‡ i.e., the complement of a union of sets equals the
intersection of the complements of the sets.

Proof. If

$$x \in C \sum_\nu B_\nu, \tag{3}$$

then

$$x \notin \sum_\nu B_\nu, \tag{4}$$

i.e., for every $\nu$, $x \notin B_\nu$ or equivalently $x \in CB_\nu$. But then

$$x \in \prod_\nu C B_\nu. \tag{5}$$

Conversely, suppose (5) holds. Then for every $\nu$, $x \in CB_\nu$, or equivalently $x \notin B_\nu$.
But then (4) holds, which implies (3). ∎

Applying the operation C to both sides of (2) and setting $A_\nu = CB_\nu$, we get

$$\sum_\nu C A_\nu = C \prod_\nu A_\nu, \tag{2'}$$

i.e., the complement of an intersection of sets equals the union of the
complements of the sets.

Formulas (2) and (2') can be combined into the following simple rule: In
interchanging the symbol C with the symbols $\sum$ and $\prod$, we must change $\sum$ to
$\prod$ and $\prod$ to $\sum$.
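Formulas (2) and (2') can be checked on a finite family of subsets of a finite set A. The sketch below is an illustration only; the helper names `complement` and `de_morgan_holds` are hypothetical, and all complements are taken relative to the given set A.

```python
def complement(B, A):
    """Return CB = A - B, the elements of A not belonging to the subset B."""
    return set(A) - set(B)


def de_morgan_holds(A, family):
    """Check formulas (2) and (2') for a finite family of subsets of A."""
    A = set(A)
    family = [set(B) for B in family]
    union = set().union(*family)
    intersection = A.intersection(*family)
    complements = [complement(B, A) for B in family]
    # (2): the complement of the union equals the intersection of the complements.
    first = complement(union, A) == A.intersection(*complements)
    # (2'): the union of the complements equals the complement of the intersection.
    second = set().union(*complements) == complement(intersection, A)
    return first and second
```

For an empty family the union is empty and the intersection is taken to be A itself, so both formulas hold trivially in this sketch.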


2.2. Equivalence of Sets 


We now establish rules allowing us to compare different sets with regard to the 
“number of elements” in each. In the case of finite sets, there is no problem at 
all. In fact, given two finite sets A and B, we need only count the number of 
elements in each set to ascertain which of the sets (if either) has more elements 
than the other, regarding A and B as equivalent if both sets turn out to have the 
same number of elements. As it stands, this definition of equivalence is not 
immediately applicable to the case of infinite sets, and we now cast it in another 
form making it suitable for infinite sets. To this end, we note that in establishing 
the equivalence or nonequivalence of two finite sets A and B, there is actually no 
need to count the number of elements in each set. For example, if A is the set of 
people in an auditorium and B the set of seats in the auditorium, then, instead of 
counting people and seats separately, we can immediately ascertain, without 
calculations, whether the two sets A and B are equivalent or not (equivalent if 
there are no empty seats or standees, nonequivalent otherwise). This leads to the 
key idea of establishing a one-to-one correspondence between two sets A and B. 


2.21. Definition. Given two sets A and B, suppose a unique element $b \in B$ is
associated with each element $a \in A$, in accordance with some rule (or
“mapping”), a fact indicated by writing $a \to b$. Suppose further that each $b \in B$ is
associated with one and only one element $a \in A$. Then A and B are said to be “in
one-to-one correspondence,” a fact indicated by writing $a \leftrightarrow b$, and the rule in
question is called a one-to-one correspondence (or “one-to-one mapping”). Two
sets A and B are said to be equivalent if they are in one-to-one correspondence.

The new definition of equivalence works for arbitrary sets, not just for finite
sets. For example, the infinite set A of all positive integers 1, 2, ... is equivalent
to the set B of all negative integers −1, −2, ... In fact, here the appropriate
one-to-one correspondence is just $n \leftrightarrow -n$. In just the same way, the set of all
positive integers 1, 2, ... is equivalent to the set of all even integers 2, 4, ..., with
$n \leftrightarrow 2n$ as the appropriate one-to-one correspondence. The last example shows
that a set can be equivalent to one of its proper subsets. Naturally, this can only
happen if the set is infinite.

If two sets A and B are equivalent, we write A ~ B. It is easy to see that the
relation ~ is reflexive (A ~ A), symmetric (if A ~ B, then B ~ A), and transitive
(if A ~ B and B ~ C, then A ~ C). Equivalent sets are also said to have the same
power.


2.22. Examples. The interval [0, a], $a > 0$ is equivalent to the unit interval [0, 1],
with $x \in [0, 1] \leftrightarrow y = ax \in [0, a]$ as the appropriate correspondence. Moreover, any
two intervals [a, b] and [a + h, b + h] of equal length b − a are equivalent, with
$x \in [a, b] \leftrightarrow y = x + h \in [a + h, b + h]$ as the correspondence. It follows that any two
closed intervals on the real line R are equivalent, and the same is true of any two
open intervals on R.

The correspondence

$$x \leftrightarrow y = \frac{1}{x}$$

establishes the equivalence of the interval $0 < x < 1$ and the half-line $y > 1$,
while the correspondence


=. ¢ 


establishes the equivalence of the whole real line $-\infty < x < \infty$ and the interval
$-1 < y < 1$.

THEOREM. The closed interval [a, b] and the open interval (a, b) are equivalent.

Proof. Let A be any sequence (see Sec. 2.81) of distinct points

$$x_1, x_2, \ldots, x_n, \ldots$$

of the closed interval [a, b], with the end points a and b as the first two points.
Clearly, the points $x_3, x_4, \ldots$ and any point $y \notin A$ all belong to the open interval
(a, b). The rule

$$x_1 \leftrightarrow x_3,$$
$$x_2 \leftrightarrow x_4,$$
$$\ldots$$
$$x_n \leftrightarrow x_{n+2},$$
$$\ldots$$
$$y \leftrightarrow y \quad (y \notin A)$$

then establishes a one-to-one correspondence between [a, b] and (a, b). ∎
It follows from the above considerations that any open interval is equivalent to 
any closed interval, to any half-line, or even to the whole real line R. 
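The correspondence used in the proof that [a, b] and (a, b) are equivalent can be made explicit once a particular sequence is fixed. The sketch below takes the interval [0, 1] and the sequence 0, 1, 1/2, 1/3, 1/4, ..., which is an assumption made for the example (the proof allows any sequence of distinct points beginning with the end points); the function names are hypothetical, and points are exact rationals.

```python
from fractions import Fraction


def term(n):
    """Return x_n, the n-th point of the sequence 0, 1, 1/2, 1/3, 1/4, ..."""
    if n == 1:
        return Fraction(0)
    if n == 2:
        return Fraction(1)
    return Fraction(1, n - 1)


def index_of(x):
    """Return n if x equals x_n for some n, and None otherwise."""
    if x == 0:
        return 1
    if x == 1:
        return 2
    if x.numerator == 1:
        return x.denominator + 1
    return None


def closed_to_open(x):
    """Map [0, 1] onto (0, 1): x_n goes to x_(n+2), every other point stays fixed."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    n = index_of(x)
    return x if n is None else term(n + 2)


def open_to_closed(y):
    """Inverse map from (0, 1) onto [0, 1]: x_n goes back to x_(n-2)."""
    y = Fraction(y)
    if not 0 < y < 1:
        raise ValueError("y must lie in (0, 1)")
    n = index_of(y)
    return y if n is None else term(n - 2)
```

Thus the end points 0 and 1 are sent to 1/2 and 1/3, the point 1/2 is shifted to 1/4, and a point such as 2/5, which does not belong to the sequence, is left where it is.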


2.3. Countable Sets 


2.31. Definition. A set equivalent to the set of all natural numbers 1, 2, ... is said
to be countable. In other words, a set is called countable if all its elements can be
“numbered” or “counted,” i.e., labeled with the positive integers. We now prove
some simple theorems involving countable sets.


2.32. THEOREM. Every infinite subset B of a countable set A is itself countable. 


Proof. We need only renumber the elements of B in the order of their appearance 
in the original numbering of the elements of A (since B is infinite, this will 
require all the natural numbers). 


2.33. THEOREM. The union of a finite or countable collection of countable sets is
itself a countable set.

Proof. First we consider the case of two countable sets $A = \{a_1, a_2, \ldots\}$ and
$B = \{b_1, b_2, \ldots\}$. Writing all the elements of both sets in a single row

$$a_1, b_1, a_2, b_2, a_3, b_3, \ldots, \tag{1}$$

we can then renumber the elements in the order of their appearance in (1), with
the proviso that any element appearing twice (any element belonging to both A
and B) is numbered when it appears for the first time and omitted when it
appears for the second time. This gives every element in the union of A and B a
number, as required.

The theorem is proved in a completely analogous way in the case of three,
four, or, more generally, any finite number of countable sets. In the case of a
countable collection of countable sets

$$A_1 = \{a_{11}, a_{12}, \ldots, a_{1n}, \ldots\},$$
$$A_2 = \{a_{21}, a_{22}, \ldots, a_{2n}, \ldots\},$$
$$\ldots$$
$$A_k = \{a_{k1}, a_{k2}, \ldots, a_{kn}, \ldots\},$$
$$\ldots$$

we need only use a somewhat different rule for writing all the elements of $A_1$,
$A_2$, ..., $A_k$, ... in a single row, with the rest of the proof remaining the same. For
example, we can group elements with the same sum of indices, writing

$$a_{11};\ a_{21}, a_{12};\ a_{31}, a_{22}, a_{13};\ a_{41}, a_{32}, a_{23}, a_{14};\ \ldots$$
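The rule of grouping elements with the same sum of indices can be written as a short generator. In the sketch below the names `diagonal_pairs`, `enumerate_union`, and `first_pairs` are hypothetical, and the sets are supplied through a function `element(k, n)` returning the nth element of the kth set, which is an assumption about how the collection is presented.

```python
from itertools import count, islice


def diagonal_pairs():
    """Yield index pairs (k, n) grouped by k + n: (1,1); (2,1), (1,2); (3,1), (2,2), (1,3); ..."""
    for total in count(2):
        for k in range(total - 1, 0, -1):
            yield k, total - k


def enumerate_union(element):
    """Yield every element of the union once, in order of first appearance in the row."""
    seen = set()
    for k, n in diagonal_pairs():
        value = element(k, n)
        if value not in seen:        # omit an element on its second appearance
            seen.add(value)
            yield value


def first_pairs(m):
    """Return the first m index pairs of the row, for inspection."""
    return list(islice(diagonal_pairs(), m))
```

For instance, if the kth set consists of the multiples k, 2k, 3k, ..., then `enumerate_union(lambda k, n: k * n)` begins 1, 2, 3, 4, 6, with the repeated values skipped.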


2.34. THEOREM. The set of all rational numbers (i.e., numbers of the form p/q
where p and $q \ne 0$ are integers) is countable.

Proof. The set Q of all rational numbers is clearly the union of the following
countable collection of countable sets:

$$A_1 = \{n: n = 0, \pm 1, \pm 2, \ldots\},$$
$$A_2 = \{n/2: n = 0, \pm 1, \pm 2, \ldots\},$$
$$\ldots$$
$$A_k = \{n/k: n = 0, \pm 1, \pm 2, \ldots\},$$
$$\ldots$$

Hence Q is itself countable, by Theorem 2.33. ∎
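The proof of Theorem 2.34 yields an explicit numbering of the rational numbers. The sketch below is one possible realization under stated assumptions: the integers are listed in the order 0, 1, −1, 2, −2, ..., the sets $A_k$ are traversed by sum of indices as in Theorem 2.33, and repeated values are omitted. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import count, islice


def nth_integer(j):
    """Return the j-th integer in the order 0, 1, -1, 2, -2, ... (j = 1, 2, 3, ...)."""
    return j // 2 if j % 2 == 0 else -(j // 2)


def rationals():
    """Yield every rational number exactly once."""
    seen = set()
    for total in count(2):
        for k in range(total - 1, 0, -1):
            j = total - k
            r = Fraction(nth_integer(j), k)   # the j-th element of A_k
            if r not in seen:
                seen.add(r)
                yield r


def first_rationals(m):
    """Return the first m rationals of the numbering."""
    return list(islice(rationals(), m))
```

The numbering begins 0, 1, 1/2, −1, 1/3, −1/2, 2, and any given rational number appears after finitely many steps.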
2.35. THEOREM. Given two countable sets

$$A = \{a_1, a_2, \ldots, a_n, \ldots\}$$

and

$$B = \{b_1, b_2, \ldots, b_n, \ldots\},$$

the set of all ordered pairs

$$(a_k, b_n) \qquad (k, n = 1, 2, \ldots) \tag{2}$$

is itself countable.

Proof. The set C of all pairs (2) is clearly the union of the following countable
collection of countable sets:

$$A_1 = \{(a_1, b_1), (a_1, b_2), \ldots, (a_1, b_n), \ldots\},$$
$$A_2 = \{(a_2, b_1), (a_2, b_2), \ldots, (a_2, b_n), \ldots\},$$
$$\ldots$$
$$A_k = \{(a_k, b_1), (a_k, b_2), \ldots, (a_k, b_n), \ldots\},$$
$$\ldots$$

Hence C is itself countable, again by Theorem 2.33. ∎

Geometrically, the pair $(a_k, b_n)$ corresponds to the point in the plane with
coordinates $a_k$, $b_n$. Thus Theorem 2.35 implies in particular that the set of all
points in the plane with rational coordinates is countable.


2.36. THEOREM. The set of all polynomials

$$P(x) = a_0 + a_1 x + \cdots + a_n x^n$$

(of arbitrary degree) with rational coefficients $a_0, a_1, \ldots, a_n$ is countable.

Proof. The set of all polynomials with rational coefficients is the union of the
countable collection of sets $A_0, A_1, \ldots, A_n, \ldots$ where $A_n$ denotes the set of all
polynomials of degree $\le n$ with rational coefficients. Hence, by Theorem 2.33,
we need only show that each set $A_n$ is countable. For n = 0, $A_n$ reduces to the set
of all rational numbers, which is countable by Theorem 2.34. We now invoke
mathematical induction, showing that countability of the set $A_n$ implies that of
the set $A_{n+1}$. Every element of $A_{n+1}$ can be written in the form

$$Q(x) + a_{n+1} x^{n+1},$$

where $a_{n+1}$ is a rational number and Q(x) is a polynomial of degree $\le n$ with
rational coefficients (i.e., an element of $A_n$). The set of all polynomials Q(x) is
countable, by the induction hypothesis, and the set of all numbers $a_{n+1}$ is also
countable. Since every element of $A_{n+1}$ can be assigned a pair $(Q(x), a_{n+1})$,
where Q(x) and $a_{n+1}$ both range over countable sets, it follows from Theorem
2.35 that $A_{n+1}$ is countable, and the induction is complete.


2.37. THEOREM. The set of all algebraic numbers (i.e., roots of polynomials with
rational coefficients) is countable.

Proof. According to Theorem 2.36, the set of all polynomials with rational
coefficients can be “numbered” with positive integers, thereby forming a
sequence

$$P_1(x), P_2(x), \ldots, P_n(x), \ldots$$

But each of these polynomials has only a finite number of roots (see Sec. 5.87).
Writing down all the roots of $P_1(x)$, then all the roots of $P_2(x)$ in the same row,
and so on (omitting any root that has already been written down), we succeed in
“counting” all the algebraic numbers. ∎


2.4. Uncountable Sets 


An infinite set which is not countable, i.e., whose elements cannot be put into a 
one-to-one correspondence with the positive integers, is said to be uncountable. 
A typical uncountable set is the continuum, namely the set of all points in a 
closed interval. This is shown by the following theorem due to Cantor (1874): 


2.41. tHeorem. The set of all points in the unit interval [0, 1] is uncountable. 


Proof. Suppose to the contrary that the set of all points in the interval [0, 1] is 
countable, so that these points can be arranged in a sequence 

$x_1, x_2, \ldots, x_n, \ldots$ (1) 

Starting from (1), we construct a sequence of nested closed intervals as follows. 
First we divide the interval [0, 1] into three equal parts. Regardless of the 
location of $x_1$, it cannot belong simultaneously to all three intervals [0, 1/3], 
[1/3, 2/3], [2/3, 1], and hence one of these intervals must fail to contain $x_1$ 
(either inside the interval or on its boundary). Let $\Delta_1$ denote this interval. 
We now divide $\Delta_1$ in turn into three equal parts, choosing a part $\Delta_2$ 
which does not contain $x_2$. More generally, once having used this procedure to 
construct the intervals $\Delta_1 \supset \Delta_2 \supset \cdots \supset \Delta_n$, 
where $x_n \notin \Delta_n$, we divide $\Delta_n$ into three equal parts, choosing a 
part $\Delta_{n+1}$ such that $x_{n+1} \notin \Delta_{n+1}$, and so on indefinitely. 
By Theorem 1.81, the resulting infinite sequence of nested intervals 
$\Delta_1 \supset \Delta_2 \supset \cdots \supset \Delta_n \supset \cdots$ has a 
nonempty intersection, containing some point $\xi$. Since $\xi$ belongs to all the 
intervals $\Delta_n$, it cannot coincide with any of the points $x_n$. But then, 
contrary to our assumption, the sequence (1) does not include every point in 
[0, 1], and the theorem is proved by contradiction. ∎
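The trisection step in the proof of Theorem 2.41 can be tried out on a finite list of points. The following is only a sketch under stated assumptions: the points are given as exact rationals, the name avoid_points is hypothetical, and a finite list can illustrate only the first few steps of the infinite construction.

```python
from fractions import Fraction

def avoid_points(points):
    """Return a closed interval [lo, hi] inside [0, 1] containing none of the given points.

    Hypothetical helper: the k-th step trisects the current interval and keeps a
    closed third that does not contain the k-th point, as in the proof above.
    """
    lo, hi = Fraction(0), Fraction(1)
    for x in points:
        third = (hi - lo) / 3
        # The three closed thirds of the current interval.
        parts = [(lo + k * third, lo + (k + 1) * third) for k in range(3)]
        # A point lies in at most two of the thirds, so one of them avoids it.
        lo, hi = next((a, b) for (a, b) in parts if not (a <= x <= b))
    return lo, hi

points = [Fraction(1, 3), Fraction(1, 2), Fraction(0), Fraction(1)]
lo, hi = avoid_points(points)
```

Every point of the resulting interval differs from each listed point, which is the finite analogue of the point in the intersection of all the nested intervals.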


2.42. It has already been shown (see Theorem 2.34) that the set of all rational 
numbers in [0, 1] is countable. Calling the remaining numbers irrational, we 
now see that [0, 1] contains “many more” irrational numbers than rational 
numbers. More exactly, the irrational numbers in [0, 1] form an uncountable set, 
since if the set of all irrational numbers in [0, 1] were countable, the set of all 
numbers in [0, 1] would also be countable (as the union of two countable sets). 
Moreover, the set of all irrational algebraic numbers, i.e., irrational roots of 
polynomials with rational coefficients, is countable, by Theorem 2.37. Hence the 
set of all transcendental numbers, i.e., numbers which are not roots of 
polynomials with rational coefficients, must be uncountable. Incidentally, this 
proves the very existence of transcendental numbers, a fact which is hardly 
obvious in advance. 


2.43. Definition. Every set equivalent to the set of all points in the interval [0, 1] 
is called a set with the power of the continuum. 

According to Sec. 2.22, the set of points in every closed interval, in every 
open interval, or even in the whole real line $-\infty < x < \infty$, is equivalent to the set 
of points in [0, 1]. Hence these sets all have the power of the continuum. 


2.5. Mathematical Structures 


2.51. From a very general point of view, mathematics deals only with sets. But 
the richness of one or another mathematical theory depends on further 
connections between the elements (or subsets) of the sets figuring in the given 
theory. These connections are formulated abstractly with the help of axioms. For 
example, in Chapter 1 the real number system R is defined as a set whose 
elements satisfy a certain rather complicated system of axioms. A set equipped 
with extra conditions (on its elements or subsets) is called a mathematical 
structure. An exact definition of a mathematical structure should contain a 
general definition of such conditions, and can be given.† Rather than go into 
this matter here, we confine ourselves to the above remarks, to be illustrated by a 
number of examples. 


2.52. Isomorphic structures. Automorphisms. Two structures equipped with 
conditions of the same kind are said to be isomorphic if there is a one-to-one 
correspondence between the elements of the structures which preserves the 
validity (or nonvalidity) of the conditions defining the given structure. Every 
structure is isomorphic to itself, since the identity mapping (carrying every 
element into itself) is obviously one-to-one and preserves all conditions satisfied 
by the elements and subsets of the structure. But there may also be nonidentical 
one-to-one mappings of a structure onto itself (cf. Sec. 2.81) which preserve the 
conditions defining the structure. Such mappings are called automorphisms of 
the structure. In other words, an automorphism of a structure is an isomorphism 
between the structure and itself. 

For example, consider the structure known as a linearly ordered set, by which 
we mean a set E of elements x, y, ... with the extra condition that given any two 
distinct elements x and y, either x < y (“x is less than y”) or y < x, where 
moreover x < y, y < z implies x < z (“transitivity of the relation <”). Thus the 
(extended) real line is a linearly ordered set, and so is any subset of the real line 
(with the usual order relation <). Two linearly ordered sets $E_1$ and $E_2$ are said to 
be isomorphic if there is a one-to-one correspondence between $E_1$ and $E_2$ such 
that $x_1 \in E_1$, $y_1 \in E_1$, $x_1 < y_1$ implies $x_2 < y_2$ for the corresponding 
elements $x_2 \in E_2$, $y_2 \in E_2$. In this sense, the whole real line is isomorphic 
to any open interval, but not to a closed interval (unlike a closed interval, the 
real line has no least element). An automorphism of a linearly ordered set is given 
by any one-to-one mapping $x \leftrightarrow y$ of the set onto itself which preserves 
order (i.e., if $x_1 \leftrightarrow y_1$, $x_2 \leftrightarrow y_2$, $x_1 < x_2$, then 
$y_1 < y_2$). Thus the formula $y = x^2$ represents an automorphism of the 
interval (0, 1), regarded as a linearly ordered set. 

There are structures whose systems of axioms define them to within an 
isomorphism, i.e., such that any two structures satisfying the given system of 
axioms are isomorphic. The system of axioms defining the given structure is 
then called complete, and the structure itself is said to be unique. 


2.53. To illustrate the above remarks, we now investigate the uniqueness of the 
real number system. It will be recalled from Chapter 1 that the real number 
system R is defined as any set of objects satisfying the axioms of Secs. 
1.21-1.24. This suggests the following question: Is any system of objects 
satisfying all these axioms unique, i.e., are any two structures of real numbers 
isomorphic? More exactly, given any two “specimens” of the real number 
system, can we establish a one-to-one correspondence between the elements of 
the specimens such that the results of adding and multiplying numbers of the 
first specimen correspond to the results of adding and multiplying corresponding 
numbers of the second system, and such that order relations between both 
corresponding elements and least upper bounds of corresponding subsets are 
preserved? The answer to this question is in the affirmative, as shown by the 
following 


THEOREM. Let $R_1$ and $R_2$ be two specimens of the real number system, where 
elements of $R_1$ are equipped with the subscript 1 and those of $R_2$ with the 
subscript 2. Then there exists a one-to-one correspondence $x_1 \leftrightarrow x_2$ 
between the elements $x_1 \in R_1$ and $x_2 \in R_2$ such that 

(1) If $x_1 \leftrightarrow x_2$, $y_1 \leftrightarrow y_2$, then $x_1 + y_1 \leftrightarrow x_2 + y_2$; 

(2) If $x_1 \leftrightarrow x_2$, $y_1 \leftrightarrow y_2$, then $x_1 y_1 \leftrightarrow x_2 y_2$; 

(3) If $x_1 \leftrightarrow x_2$, $y_1 \leftrightarrow y_2$, $x_1 < y_1$, then $x_2 < y_2$; 

(4) If the set $A_1 \subset R_1$ is bounded from above, then so is the set $A_2$ consisting of 
the corresponding elements of $R_2$, and moreover 

$\sup A_1 \leftrightarrow \sup A_2$. 

Proof. The proof will be given in several steps (Secs. 2.54-2.58). First we note 
in passing that the isomorphism of $R_1$ and $R_2$ as structures is already implicit in 
the possibility of representing the elements of $R_1$ and $R_2$ as decimals, as in Sec. 
1.77. However, here we use a different proof, which is particularly suited to the 
direct verification of Properties 1-4 above. 


2.54. Step 1. Let $Q_1$ be the set of all rational numbers in $R_1$ and $Q_2$ the set of all 
rational numbers in $R_2$. Then there is a natural one-to-one correspondence 
between $Q_1$ and $Q_2$, established by the very way of writing rational numbers in 
the form m/n. Properties 1-3 obviously hold for the elements of $Q_1$ and $Q_2$, since 
the operations and relations in question can be expressed directly in terms of the 
symbols m and n. 

Now let $\xi_1 \in R_1$ be arbitrary. Then, by Theorem 1.76, 

$\xi_1 = \sup N_1(\xi_1) = \inf P_1(\xi_1)$, 

where $N_1(\xi_1)$ is the set of all rational numbers $r_1 \le \xi_1$ and $P_1(\xi_1)$ is the 
set of all rational numbers $s_1 \ge \xi_1$ (note that $r_1 \le s_1$ always holds). Let $N_2$ 
be the set of all rational numbers in $R_2$ corresponding to the rational numbers in 
$N_1(\xi_1)$, i.e., with the same representations as fractions of the form m/n, and let 
$P_2$ be the set of all rational numbers in $R_2$ corresponding to the rational numbers 
in $P_1(\xi_1)$. Then clearly $r_2 \le s_2$ for all $r_2 \in N_2$, $s_2 \in P_2$, and hence 

$\sup N_2 \le \inf P_2$, 

by Theorem 1.62b. Since all the rational numbers in $R_2$ fall either in $N_2$ or in $P_2$, 
there are no rational numbers between $\alpha_2 = \sup N_2$ and $\beta_2 = \inf P_2$. But then 
$\alpha_2 = \beta_2$, since otherwise there would be a rational point between $\alpha_2$ and 
$\beta_2$, by Theorem 1.75. We now associate the number $\xi_2 = \alpha_2 = \beta_2 \in R_2$ 
with the given number $\xi_1 \in R_1$. The fact that this correspondence is one-to-one 
follows by “symmetry” (give the details). 


2.55. Step 2. Next we prove that the correspondence $\leftrightarrow$ just constructed has 
Properties 3 and 4. If $x_1 < y_1$, then, by Theorem 1.75, there is a rational number 
$r_1$ such that $x_1 < r_1 < y_1$. But $x_2 < r_2 < y_2$ for the corresponding numbers 
$x_2, r_2, y_2 \in R_2$, by the very definition of the correspondence, and hence Property 3 
holds. Next, given any set $A_1 \subset R_1$ which is bounded from above, let 
$\xi_1 = \sup A_1$, let $A_2 \subset R_2$ be the set corresponding to $A_1$, and let 
$\xi_2 \in R_2$ be the number corresponding to $\xi_1$. Then $\xi_1 \ge x_1$ for every 
$x_1 \in A_1$ and hence $\xi_2 \ge x_2$ for every $x_2 \in A_2$, so that 
$\sup A_2 \le \xi_2$ (in particular, this shows that $A_2$ is bounded from above). If 
$\sup A_2 < \xi_2$, then there is a number (in fact, a rational number) $y_2$ such that 
$x_2 < y_2 < \xi_2$ for every $x_2 \in A_2$, so that $x_1 < y_1 < \xi_1$ for every 
$x_1 \in A_1$, where $y_1$ corresponds to $y_2$. But then 

$\xi_1 = \sup A_1 \le y_1 < \xi_1$, 

which is impossible. Therefore $\sup A_2 = \xi_2$, which proves Property 4. 


2.56. Step 3. To prove Properties 1 and 2, we will need the following two 
lemmas: 


a. Lemma. Given any real numbers y, z, and any rational number r > y + z, there 
are rational numbers u > y, v > z such that u + v = r. 

Proof. Let r − (y + z) = h. Choosing u to be any rational number in the interval 
(y, y + h), let v = r − u. Then obviously v is rational and 

v > r − (y + h) = z. ∎ 

b. Lemma. Given any positive real numbers y, z and any rational number r > yz, 
there are rational numbers u > y, v > z such that uv = r. 

Proof. Let r/yz = h > 1. Choosing u to be any rational number in the interval (y, 
yh), let v = r/u. Then obviously v is rational and 

v > r/yh = z. ∎ 
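The choices made in the two lemmas can be followed numerically. In the sketch below, y and z are assumed to be rational themselves (so that a midpoint can serve as the rational number u), whereas the lemmas allow arbitrary real y and z. The function names split_sum and split_product are hypothetical.

```python
from fractions import Fraction

def split_sum(y, z, r):
    """Lemma a: for r > y + z, return rationals u > y, v > z with u + v == r."""
    h = r - (y + z)        # h > 0 by assumption
    u = y + h / 2          # one rational point of the interval (y, y + h)
    v = r - u              # then v > r - (y + h) = z
    return u, v

def split_product(y, z, r):
    """Lemma b: for y, z > 0 and r > y*z, return rationals u > y, v > z with u*v == r."""
    h = r / (y * z)        # h > 1 by assumption
    u = y * (1 + h) / 2    # one rational point of the interval (y, y*h)
    v = r / u              # then v > r / (y*h) = z
    return u, v

y, z, r = Fraction(1, 3), Fraction(2, 5), Fraction(1)
u_sum, v_sum = split_sum(y, z, r)
u_prod, v_prod = split_product(y, z, r)
```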


2.57. Step 4. We are now able to verify Property 1. Suppose $x_1 \leftrightarrow x_2$, 
$y_1 \leftrightarrow y_2$, $x_1 + y_1 \leftrightarrow z_2 \ne x_2 + y_2$, where 
$z_2 > x_2 + y_2$, say, and let $t_2$ be a rational number in the interval 
$(x_2 + y_2, z_2)$. Then, by Lemma 2.56a, there are rational numbers $u_2 > x_2$, 
$v_2 > y_2$ such that $u_2 + v_2 = t_2$. For the corresponding numbers $z_1, t_1, u_1, v_1$, 
we have $x_1 < u_1$, $y_1 < v_1$, $x_1 + y_1 < u_1 + v_1 = t_1$. But 
$t_2 < z_2 \leftrightarrow x_1 + y_1$ implies $t_1 < x_1 + y_1$, which contradicts the 
inequality $x_1 + y_1 < t_1$ and hence is incompatible with the choice 
$t_2 \in (x_2 + y_2, z_2)$. An analogous argument shows that $z_2 < x_2 + y_2$ is also 
impossible, and hence 

$z_2 = x_2 + y_2$. 


2.58. Final step. Turning to Property 2, we first consider the case of positive $x_1$ 
and $y_1$, which by Property 3 corresponds to positive $x_2$ and $y_2$. Suppose 
$x_1 y_1 \leftrightarrow z_2 \ne x_2 y_2$, where $z_2 > x_2 y_2$, say, and let $t_2$ be a 
rational number in the interval $(x_2 y_2, z_2)$. Then, by Lemma 2.56b, there are 
rational numbers $u_2 > x_2$, $v_2 > y_2$ such that $u_2 v_2 = t_2$. For the corresponding 
numbers $z_1, t_1, u_1, v_1$ we have $x_1 < u_1$, $y_1 < v_1$, $x_1 y_1 < u_1 v_1 = t_1$. But 
$t_2 < z_2 \leftrightarrow x_1 y_1$ implies $t_1 < x_1 y_1$, which contradicts the inequality 
$x_1 y_1 < t_1$ and hence is incompatible with the choice $t_2 \in (x_2 y_2, z_2)$. An 
analogous argument shows that $z_2 < x_2 y_2$ is also impossible, and hence 
$z_2 = x_2 y_2$. 

To prove the general case, we need only verify that $-1 \cdot x_1 \leftrightarrow -1 \cdot x_2$ 
(because of the usual sign rule for multiplication). Suppose 
$-1 \cdot x_1 \leftrightarrow z_2$. It follows from $-1 \cdot x_1 + x_1 = 0$ that 
$z_2 + x_2 = 0$ and hence $z_2 = -1 \cdot x_2$. The proof of our theorem is now complete. 


2.59. Having just proved the uniqueness of the real number system R to within 
an isomorphism, we now look for automorphisms of R, by which we mean one- 
to-one mappings of R onto itself preserving sums, products, inequalities, and 
least upper bounds (cf. Secs. 2.52—2.53). It turns out that the only automorphism 
of R is the identity mapping, carrying every element of R into itself. To see this, 
consider an automorphism of R carrying the element $x \in R$ into the element 
$x' \in R$. Then the elements 0' and 1' are the zero and unit elements in the field R, 
since $0' + x' = x'$ and $1' \cdot x' = x'$ for every x'. Therefore 0' = 0, 1' = 1, by the 
uniqueness of the zero and unit elements (Theorems 1.31 and 1.41a). It follows that 

$2' = 1' + 1' = 2, \ldots, n' = (n-1)' + 1' = n, \qquad (m/n)' = m'/n' = m/n$ 

$(m = 0, 1, 2, \ldots;\ n = 1, 2, \ldots)$. 

Moreover 

$(-m/n)' = -(m/n)' = -m/n$ 

by the uniqueness of the negative element (Theorem 1.32). Thus our 
automorphism leaves the rational numbers unchanged. But then it also leaves 
every real number $\xi$ unchanged, since, by Theorem 1.76, 

$\xi = \sup N_\xi$, 

where $N_\xi$ is the set of all rational numbers $\le \xi$, and any automorphism of R must 
preserve inequalities and least upper bounds. Hence our automorphism is just the 
identity mapping $\xi \leftrightarrow \xi$. 


2.6. n-Dimensional Space 


2.61. Definition. Given a positive integer n, let $R_n$ be the set of all ordered 
n-tuples 

$x = (x_1, \ldots, x_n)$ 

consisting of n real numbers $x_1, \ldots, x_n$. Then x is called a vector, with 
components $x_1, \ldots, x_n$. Two vectors $x = (x_1, \ldots, x_n)$ and 
$y = (y_1, \ldots, y_n)$ are said to be equal if 

$x_1 = y_1, \ldots, x_n = y_n$. 

The set $R_n$ itself is called n-dimensional (real) space. 

2.62. The operation of addition of elements of $R_n$ is defined by the formula 

$(x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n)$. (1) 

Addition in $R_n$ satisfies all the axioms of Sec. 1.21, with the symbols x, y, ... 
regarded as elements of $R_n$. In fact, if $x = (x_1, \ldots, x_n)$, 
$y = (y_1, \ldots, y_n)$, $z = (z_1, \ldots, z_n)$, then clearly 

(a) $x + y = (x_1 + y_1, \ldots, x_n + y_n) = y + x$; 
(b) $(x + y) + z = (x_1 + y_1 + z_1, \ldots, x_n + y_n + z_n) = x + (y + z)$; 
(c) $R_n$ contains an element 0 = (0, ..., 0), called the zero vector, such that 
$x + 0 = (x_1, \ldots, x_n) + (0, \ldots, 0) = x$ for every $x \in R_n$; 
(d) For every $x \in R_n$ there exists an element $y \in R_n$, called the negative (or 
opposite) of the vector x, such that x + y = 0, namely the element 
$y = (-x_1, \ldots, -x_n)$. 

Naturally, addition in R,, satisfies not only Axioms a-d, but also all the 
consequences of these axioms (given in Sec. 1.3). 


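2.63. The operation of multiplication of elements of $R_n$ by real numbers 
$\alpha \in R$ is defined by the formula 

$\alpha(x_1, \ldots, x_n) = (\alpha x_1, \ldots, \alpha x_n)$. (2) 

Then obviously 

$\alpha(x + y) = \alpha x + \alpha y$, 
$(\alpha + \beta)x = \alpha x + \beta x$, 
$1 \cdot x = x$, 
$\alpha(\beta x) = (\alpha\beta)x$. 
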
2.64. Definition. We say that m vectors 

$x^{(1)} = (x_1^{(1)}, \ldots, x_n^{(1)}), \ldots, x^{(m)} = (x_1^{(m)}, \ldots, x_n^{(m)})$ 

in the space $R_n$ are linearly dependent if there exist constants $c_1, \ldots, c_m$, 
not all zero, such that 

$c_1 x^{(1)} + \cdots + c_m x^{(m)} = 0$. (3) 

If (3) holds only if $c_1 = \cdots = c_m = 0$, the vectors $x^{(1)}, \ldots, x^{(m)}$ are 
called linearly independent. 

According to (1) and (2), the “vector equation” (3) is equivalent to the 
following system of “numerical equations”: 

$c_1 x_1^{(1)} + \cdots + c_m x_1^{(m)} = 0$, 
$c_1 x_2^{(1)} + \cdots + c_m x_2^{(m)} = 0$, (3′) 
$\cdots$ 
$c_1 x_n^{(1)} + \cdots + c_m x_n^{(m)} = 0$. 

The reader familiar with linear algebra will infer from (3′) that the vectors 
$x^{(1)}, \ldots, x^{(m)}$ are linearly dependent if and only if the rank of the matrix 
$\|x_j^{(i)}\|$ is less than m.† It should also be noted that any nonsingular square 
matrix $\|x_j^{(i)}\|$ of order n specifies a system of n linearly independent vectors 
in $R_n$, while any n + 1 vectors in $R_n$ are linearly dependent. 
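The rank criterion can be checked by exact Gaussian elimination. The following is a sketch, not a definitive implementation: the names rank and linearly_dependent are hypothetical, and the vectors are assumed to have rational components so that the arithmetic is exact.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the matrix whose rows are the given vectors (exact arithmetic)."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_dependent(vectors):
    """True if the rank is less than the number m of vectors."""
    return rank(vectors) < len(vectors)
```

For instance, the three vectors (1, 0), (0, 1), (1, 1) in the plane have rank 2, in agreement with the remark that any n + 1 vectors in n-dimensional space are linearly dependent.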


2.65. Let $f_1, \ldots, f_n$ be any system of n linearly independent vectors in $R_n$. 
Then, given any vector $g \in R_n$, it follows from the linear dependence of the 
vectors $f_1, \ldots, f_n, g$ that 

$c_0 g + c_1 f_1 + \cdots + c_n f_n = 0$, (4) 

where certainly $c_0 \ne 0$, since otherwise the vectors $f_1, \ldots, f_n$ would be 
linearly dependent. Hence we can solve (4) for g, obtaining a relation of the form 

$g = \alpha_1 f_1 + \cdots + \alpha_n f_n$. 

The numbers $\alpha_1, \ldots, \alpha_n$ are called the components of the vector g with 
respect to the basis $f_1, \ldots, f_n$. 

For example, choosing $f_1, \ldots, f_n$ to be the vectors 

$e_1 = (1, 0, \ldots, 0), \ldots, e_n = (0, 0, \ldots, 1)$, (5) 

we find that any vector $x = (x_1, \ldots, x_n) \in R_n$ can be represented in the form 

$x = x_1 e_1 + \cdots + x_n e_n$, 

so that the numbers $x_1, \ldots, x_n$ are just the components of the vector x with 
respect to the special basis (5). 
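The components of a vector with respect to a basis can be computed by solving the corresponding linear system. The sketch below uses Gauss-Jordan elimination with exact rationals. The name components is hypothetical, and the basis is assumed to be linearly independent (otherwise no pivot is found and the call fails).

```python
from fractions import Fraction

def components(basis, g):
    """Solve a_1 f_1 + ... + a_n f_n = g for the components (a_1, ..., a_n)."""
    n = len(basis)
    # Augmented matrix: column k holds the vector f_k, the last column holds g.
    m = [[Fraction(basis[k][i]) for k in range(n)] + [Fraction(g[i])]
         for i in range(n)]
    for col in range(n):
        # A nonzero pivot exists whenever the basis vectors are independent.
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [a / m[col][col] for a in m[col]]
        for i in range(n):
            if i != col:
                m[i] = [a - m[i][col] * b for a, b in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]

alphas = components([(1, 1), (1, -1)], (3, 1))
```

With respect to the special basis (5) the same call simply returns the original components of the vector.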


2.66. Given a fixed basis $f_1, \ldots, f_n$ in the space $R_n$, let 

$g = \alpha_1 f_1 + \cdots + \alpha_n f_n, \qquad h = \beta_1 f_1 + \cdots + \beta_n f_n$. 

Then, by Sec. 2.63, 

$g + h = (\alpha_1 + \beta_1) f_1 + \cdots + (\alpha_n + \beta_n) f_n$, 
$cg = c\alpha_1 f_1 + \cdots + c\alpha_n f_n$ for every $c \in R$, 

so that addition of vectors and multiplication of a vector by a number c leads (in 
any basis) to addition of the corresponding components and multiplication of 
these components by c. Thus the mapping which associates the vector 
$(\alpha_1, \ldots, \alpha_n)$ with every vector $x = (x_1, \ldots, x_n)$, where 
$\alpha_1, \ldots, \alpha_n$ are the components of x with respect to a fixed basis 
$f_1, \ldots, f_n$, is an automorphism (Sec. 2.52) of the n-dimensional space $R_n$. 


2.67. More generally, let A be any automorphism of the n-dimensional space 
$R_n$. Suppose A carries the element $x \in R_n$ into the element $x' \in R_n$, a fact 
indicated by writing $A: x \to x'$ or $x' = A(x)$. Being an automorphism of $R_n$, A 
preserves sums of vectors and products of vectors with real numbers, so that 

(a) A(x + y) = A(x) + A(y) for every $x, y \in R_n$; 
(b) $A(\alpha x) = \alpha A(x)$ for every $x \in R_n$ and $\alpha \in R$. 

Properties a and b immediately imply the following more general formula: 

(c) $A\left(\sum_{k=1}^{m} \alpha_k x_k\right) = \sum_{k=1}^{m} \alpha_k A(x_k)$ for every 
$x_1, \ldots, x_m \in R_n$ and $\alpha_1, \ldots, \alpha_m \in R$. 

The effect of the automorphism A on the components of the vector x can be 
expressed in terms of its effect on the vectors of the basis (5). Let 

$x = (x_1, \ldots, x_n), \qquad x' = A(x) = (x_1', \ldots, x_n')$, 

and suppose 

$e_k' = A(e_k) = (a_{1k}, a_{2k}, \ldots, a_{nk})$, 

where $e_k$ is the kth vector of the basis (5), with 1 as its kth component and 0 for 
all its other components. Then clearly 

$x' = \sum_{j=1}^{n} x_j' e_j = A(x) = A\left(\sum_{k=1}^{n} x_k e_k\right) = \sum_{k=1}^{n} x_k A(e_k)$ 
$= \sum_{k=1}^{n} x_k \left(\sum_{j=1}^{n} a_{jk} e_j\right) = \sum_{j=1}^{n} \left(\sum_{k=1}^{n} a_{jk} x_k\right) e_j$, 

which implies 

$x_j' = \sum_{k=1}^{n} a_{jk} x_k \qquad (j = 1, \ldots, n)$. (6) 

These linear relations, describing the effect of the automorphism A on the 
components of the vector x, must give a one-to-one mapping (or 
“transformation”) of the space $R_n$ onto itself. It follows that the square matrix 
$\|a_{jk}\|$ is nonsingular, i.e., has a nonvanishing determinant. Conversely, every 
nonsingular matrix $\|a_{jk}\|$ specifies a one-to-one mapping of the space $R_n$ onto 
itself, in accordance with the formulas (6), satisfying Properties a and b.† Thus, 
finally, every automorphism of the space $R_n$ is specified by a nonsingular matrix 
$\|a_{jk}\|$ via the formulas (6). 

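We now give some examples illustrating the above considerations. The 
identity transformation I specified by the matrix 

$\begin{Vmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix}$ 

carries every vector into itself. The automorphism $A_j$ specified by the matrix 

$\begin{Vmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda_j & & \\ & & & \ddots & \\ & & & & 1 \end{Vmatrix} \qquad (\lambda_j \ne 0)$ 

(with all unwritten elements equal to 0) magnifies the jth component of every 
vector $\lambda_j$ times, and is accordingly called a $\lambda_j$-fold expansion along the 
$x_j$-axis. The automorphism specified by the matrix 

$\begin{Vmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{Vmatrix} \qquad (\lambda_1 \lambda_2 \cdots \lambda_n \ne 0)$ 

produces a $\lambda_1$-fold expansion along the $x_1$-axis, a $\lambda_2$-fold expansion along 
the $x_2$-axis, and so on. If the numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ are all 
equal, the corresponding automorphism, specified by the matrix 

$\begin{Vmatrix} \lambda & & & 0 \\ & \lambda & & \\ & & \ddots & \\ 0 & & & \lambda \end{Vmatrix} \qquad (\lambda \ne 0)$ 

carries every vector x into the vector $\lambda x$. Such an automorphism is called a 
similarity transformation, with ratio of similitude $\lambda$. 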
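The action of a matrix on the components of a vector, as in formulas (6), and the expansion along one axis can be sketched as follows. The names apply and expansion are hypothetical, and matrices are assumed to be given as lists of rows.

```python
def apply(a, x):
    """Formulas (6): the j-th new component is the sum over k of a[j][k] * x[k]."""
    return tuple(sum(a[j][k] * x[k] for k in range(len(x))) for j in range(len(a)))

def expansion(n, j, lam):
    """Matrix of the lam-fold expansion along the x_j-axis (j counted from 1)."""
    m = [[1 if i == k else 0 for k in range(n)] for i in range(n)]
    m[j - 1][j - 1] = lam
    return m

a = [[2, 1], [1, 1]]          # a nonsingular matrix (determinant 1)
x, y = (1, 2), (3, -1)
image_of_sum = apply(a, (x[0] + y[0], x[1] + y[1]))
sum_of_images = tuple(p + q for p, q in zip(apply(a, x), apply(a, y)))
stretched = apply(expansion(3, 2, 5), (1, 2, 3))
```

Here image_of_sum and sum_of_images agree, illustrating Property a, and stretched has its second component multiplied by 5 while the others are unchanged.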


2.7. Complex Numbers 


2.71. The following question now arises quite naturally: Can an operation of 
multiplication of vectors be introduced in the space $R_n$, associating a vector xy 
with every pair of vectors $x, y \in R_n$, such that the multiplication axioms a-e of 
Sec. 1.22 are satisfied? It turns out that the answer to this question is in the 
affirmative if n = 2 (cf. Sec. 2.76). One way of introducing multiplication in $R_2$ 
goes as follows. 

Let 

$e_1 = (1, 0), \qquad e_2 = (0, 1)$. 

Then every vector $z \in R_2$ can be written in the form 

$z = (x, y) = x e_1 + y e_2$, 

where x and y are the components of z with respect to the basis $e_1, e_2$. Now let 

$e_1^2 = e_1, \qquad e_2^2 = -e_1, \qquad e_1 e_2 = e_2 e_1 = e_2$, (1) 

and use “linearity” to extend this operation of multiplication to all vectors 
$z = x e_1 + y e_2 \in R_2$. To be more explicit, if $w = u e_1 + v e_2$, then 

$zw = (x e_1 + y e_2)(u e_1 + v e_2)$ 
$= xu\, e_1^2 + xv\, e_1 e_2 + yu\, e_2 e_1 + yv\, e_2^2 = (xu - yv) e_1 + (xv + yu) e_2$. 

Thus the definition of multiplication in $R_2$ is just 

$zw = (x, y)(u, v) = (xu - yv, xv + yu)$. (2) 

The product (2) obviously has the following properties: 

(a) $wz = (ux - vy, vx + uy) = zw$; 
(b) If $\gamma = (\alpha, \beta)$, then 
$(zw)\gamma = (xu\alpha - yv\alpha - xv\beta - yu\beta,\ xu\beta - yv\beta + xv\alpha + yu\alpha)$ 
$= (x(u\alpha - v\beta) - y(v\alpha + u\beta),\ x(v\alpha + u\beta) + y(u\alpha - v\beta)) = z(w\gamma)$; 
(c) $R_2$ contains an element e = (1, 0), called the unit element, such that 
(1, 0)(u, v) = (u, v) for every $(u, v) \in R_2$; 
(d) For every (x, y) ≠ (0, 0) in $R_2$ there exists an element $(u, v) \in R_2$, called the 
reciprocal of (x, y), such that (x, y)(u, v) = (1, 0), namely the element 

$(u, v) = \left(\frac{x}{x^2 + y^2},\ -\frac{y}{x^2 + y^2}\right)$, 

since clearly 

$(x, y)(u, v) = \left(\frac{x^2}{x^2 + y^2} + \frac{y^2}{x^2 + y^2},\ -\frac{xy}{x^2 + y^2} + \frac{xy}{x^2 + y^2}\right) = (1, 0)$; 

(e) The formula 

$\gamma(z + w) = \gamma z + \gamma w$ 

holds for every z = (x, y), w = (u, v), $\gamma = (\alpha, \beta)$ in $R_2$, since 

$\gamma(z + w) = (\alpha, \beta)(x + u, y + v)$ 
$= (\alpha(x + u) - \beta(y + v),\ \beta(x + u) + \alpha(y + v))$ 
$= ((\alpha x - \beta y) + (\alpha u - \beta v),\ (\beta x + \alpha y) + (\beta u + \alpha v))$ 
$= (\alpha x - \beta y, \beta x + \alpha y) + (\alpha u - \beta v, \beta u + \alpha v)$ 
$= (\alpha, \beta)(x, y) + (\alpha, \beta)(u, v) = \gamma z + \gamma w$. 

Thus multiplication in $R_2$, as just defined, satisfies not only Axioms a-e of 
Sec. 1.22, but also all the consequences of these axioms (given in Sec. 1.4). 
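Properties a-e can be spot-checked on sample pairs. The sketch below implements rule (2) on ordered pairs with exact rational arithmetic. The names mul, add and reciprocal are hypothetical, and a check on a few pairs is of course not a proof.

```python
from fractions import Fraction

def mul(z, w):
    """Rule (2): (x, y)(u, v) = (xu - yv, xv + yu)."""
    (x, y), (u, v) = z, w
    return (x * u - y * v, x * v + y * u)

def add(z, w):
    """Componentwise addition of pairs."""
    return (z[0] + w[0], z[1] + w[1])

def reciprocal(z):
    """Property d: (x/(x^2 + y^2), -y/(x^2 + y^2)) for (x, y) != (0, 0)."""
    x, y = z
    d = x * x + y * y
    return (Fraction(x) / d, -Fraction(y) / d)

z, w, g = (1, 2), (3, -1), (2, 5)
commutative = mul(z, w) == mul(w, z)
associative = mul(mul(z, w), g) == mul(z, mul(w, g))
distributive = mul(g, add(z, w)) == add(mul(g, z), mul(g, w))
unit = mul(z, reciprocal(z))
```

Note that mul((0, 1), (0, 1)) gives (-1, 0), which is the relation for the second basis vector in (1).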


2.72. Definition. The space $R_2$ equipped with the addition operation 

$(x, y) + (u, v) = (x + u, y + v)$ 

and the multiplication operation (2) is called the field of complex numbers 
(synonymously, the complex number system or complex plane), denoted by C. 
The one-to-one correspondence 

$x \leftrightarrow (x, 0)$ (3) 

between the real number system R and the complex number system C preserves 
sums and products, since 

$(x, 0) + (y, 0) = (x + y, 0), \qquad (x, 0)(y, 0) = (xy, 0)$. 

In this sense, the correspondence (3) “embeds” the system R in the system C. 
Hence the complex number (x, 0) can be identified with the real number x, a fact 
expressed by writing 

(x, 0) = x. 

In particular, the basis vector $e_1 = (1, 0)$ can be denoted by 1. Denoting the 
second basis vector $e_2 = (0, 1)$ by i, we can now write the complex number (x, y) 
in the form x + iy. Using (1), we have 

$i^2 = e_2^2 = -e_1 = (-1, 0) = -1$, 

and hence $i = \sqrt{-1}$. Moreover, since 

$(-i)^2 = i^2 = -1$, 

the number −i is another complex square root of −1. There are no other complex 
square roots of −1, since if $z^2 = -1$ for some $z \in C$, then 
$(z + i)(z - i) = z^2 + 1 = 0$, so that, by Sec. 1.47b (together with the concluding 
remark of Sec. 2.71), either z + i = 0 or z − i = 0, i.e., either z = −i or z = i. 

In terms of the “imaginary unit” $i = \sqrt{-1}$, we can now write the multiplication 
rule (2) in the form 

$(x + iy)(u + iv) = (xu - yv) + i(xv + yu)$, 

in keeping with the usual rule for multiplying binomials (together with the 
formula $i^2 = -1$). 

Given a complex number z = x + iy, the number x is called the real part of z, 
denoted by 

x = Re z, 

while the number y is called the imaginary part of z, denoted by 

y = Im z. 

If Re z = 0, Im z ≠ 0, the number z is said to be purely imaginary. The set of all 
complex numbers z with Im z = 0 is called the real axis, while the set of all z 
with Re z = 0 is called the imaginary axis. Because of the correspondence 
$x \leftrightarrow (x, 0) \leftrightarrow x + i0$, the real axis can be identified with the set of 
all real numbers (i.e., the “real line”). 


2.73. The complex numbers x + iy and x − iy are said to be conjugate complex 
numbers or complex conjugates (of each other). If one of these numbers is 
denoted by z, the other is denoted by $\bar{z}$. 

a. THEOREM. The formula 

$\bar{z} = z$ (4) 

holds if and only if z is real. 

Proof. If z = x + iy, then (4) holds if and only if 

$x - iy = x + iy$, 

i.e., if and only if y = 0, in which case z = x is real. ∎ 

b. THEOREM. The formulas 

$\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$, (5) 

$\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$ (6) 

hold for arbitrary $z_1, z_2 \in C$. 

Proof. If $z_1 = x_1 + iy_1$, $z_2 = x_2 + iy_2$, then 

$\overline{z_1 + z_2} = (x_1 + x_2) - i(y_1 + y_2) = (x_1 - iy_1) + (x_2 - iy_2) = \bar{z}_1 + \bar{z}_2$, 

which proves (5), while 

$\overline{z_1 z_2} = (x_1 x_2 - y_1 y_2) - i(x_1 y_2 + x_2 y_1)$ 
$= (x_1 - iy_1)(x_2 - iy_2) = \bar{z}_1 \bar{z}_2$, 

which proves (6). ∎ 

In particular, (6) implies $\overline{\alpha z} = \alpha \bar{z}$ for all real $\alpha$. 
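Formulas (4)-(6) can be spot-checked with Python's built-in complex type. This is only an illustration on sample values with small integer parts (chosen so that the floating-point arithmetic is exact). The variable names are hypothetical.

```python
z1, z2 = complex(1, 2), complex(3, -4)

# Formula (5): the conjugate of a sum is the sum of the conjugates.
sum_ok = (z1 + z2).conjugate() == z1.conjugate() + z2.conjugate()

# Formula (6): the conjugate of a product is the product of the conjugates.
prod_ok = (z1 * z2).conjugate() == z1.conjugate() * z2.conjugate()

# Formula (4): a real number equals its own conjugate, a non-real one does not.
real_ok = complex(5, 0).conjugate() == complex(5, 0)
nonreal_ok = z1.conjugate() != z1
```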


2.74. Uniqueness of the complex number system 


THEOREM. Let $R_2$ be a two-dimensional space equipped with a multiplication 
operation satisfying the axioms of Sec. 1.22. Then $R_2$ contains two linearly 
independent vectors $g_1$ and $g_2$ such that 

$g_1^2 = g_1, \qquad g_2^2 = -g_1, \qquad g_1 g_2 = g_2 g_1 = g_2$. 

Proof. By Axiom c of Sec. 1.22, $R_2$ contains a nonzero unit element 1. Let 
$g_1 = 1$, and let $f_1$ be any vector linearly independent of $g_1$. Then 

$f_1^2 = \alpha g_1 + \beta f_1$ 

for suitable real $\alpha$ and $\beta$. For the vector $f_2 = f_1 + \gamma g_1 \ne 0$ 
($\gamma$ real) we have 

$(f_1 + \gamma g_1)^2 = f_1^2 + 2\gamma f_1 + \gamma^2 g_1 = \alpha g_1 + (\beta + 2\gamma) f_1 + \gamma^2 g_1$, 

and hence, choosing $\gamma = -\beta/2$, we get 

$f_2^2 = (\alpha + \gamma^2) g_1$. 

By Theorem 1.47a, the real number $\alpha + \gamma^2$ cannot equal 0. Moreover, 
$\alpha + \gamma^2$ cannot be positive, since then we would have 

$(f_2 - \sqrt{\alpha + \gamma^2}\, g_1)(f_2 + \sqrt{\alpha + \gamma^2}\, g_1) = 0$, 

contrary to Sec. 1.47b. It follows that 

$\alpha + \gamma^2 = -\delta^2 \qquad (\delta > 0)$. 

Replacing $f_2$ by 

$g_2 = \frac{1}{\delta} f_2$, 

we get $g_2^2 = -g_1$. Finally, the fact that $g_1$ is the unit element implies 
$g_1 g_2 = g_2 g_1 = g_2$. ∎ 

Let $R_2$ and $R_2'$ both be two-dimensional spaces, each equipped with a 
multiplication operation satisfying the axioms of Sec. 1.22. Then, according to 
Theorem 2.74, we can establish a one-to-one correspondence between $R_2$ and 
$R_2'$ preserving multiplication, i.e., the corresponding structures (two-dimensional 
spaces equipped with multiplication) are isomorphic. 


2.75. Automorphisms of the complex number system. As shown in Sec. 2.59, 
the only automorphism of the real number system is the identity automorphism 
$\xi \leftrightarrow \xi$. We now turn to automorphisms of the complex number system C. 
By an automorphism of C is meant a one-to-one mapping $z \leftrightarrow z'$ of C onto 
itself such that, given arbitrary $z_1, z_2 \in C$ and real $\alpha$, 

(1) $(z_1 + z_2)' = z_1' + z_2'$; 

(2) $(z_1 z_2)' = z_1' z_2'$; 

(3) $(\alpha z)' = \alpha z'$. 

Properties 1 and 3 show that an automorphism of C as defined here is an 
automorphism of C regarded as a two-dimensional space (see Sec. 2.67), while 
Property 2 says that our automorphism must also preserve products. 

THEOREM. The automorphism $x + iy \leftrightarrow x - iy$ is the only automorphism of C 
other than the identity automorphism. 

Proof. Let A be an automorphism of C carrying the element x + iy into the 
element (x + iy)'. Then Property 2 implies 

$1' \cdot z' = (1 \cdot z)' = z'$, 

so that 1' is the unit element of C. Hence 1' = 1, by the uniqueness of the unit 
element. Moreover 

$x' = (1 \cdot x)' = 1' \cdot x = 1 \cdot x = x$ 

by Property 3, so that the automorphism A leaves the whole set of real numbers 
unchanged. Finally, by Property 2, 

$(i')^2 = (i^2)' = (-1)' = -1$, 

so that i' equals either i or −i (cf. Sec. 2.72). In the first case, 

$(x + iy)' = x' + i'y' = x + iy$, 

while, in the second case, 

$(x + iy)' = x' + i'y' = x - iy$. ∎ 

The study of complex numbers will be pursued in Chapter 5 (Sec. 5.72 ff.). 


2.76. It is natural to ask whether a multiplication operation satisfying Axioms a-e 
of Sec. 1.22 can be defined for vectors of a general n-dimensional space $R_n$. 
According to a theorem of Frobenius which will not be proved here,† this 
cannot be done if n > 2. 


2.8. Functions and Graphs 


2.81. Given two sets X and Y, suppose a unique element $y \in Y$ is associated with 
each element $x \in X$, in accordance with some rule or “mapping” f, a fact indicated 
by writing $f: x \to y$ or y = f(x).† Then f is called a function with domain (of 
definition) X,‡ while x is called the independent variable (or argument) of the 
function f and y is called the dependent variable (or value) of f. Following the 
customary slight abuse of notation, we will often refer to the “function f(x)” as 
well as to the “function f,” although, strictly speaking, f(x) is the value of f 
corresponding to the argument x. 

It should be emphasized that there is no need for every $y \in Y$ to be a value of f 
for some $x \in X$, nor is there any need for distinct values of x to give distinct 
values of f. However, if both of these extra conditions are satisfied, then 
f becomes a one-to-one function, establishing a one-to-one correspondence (Sec. 
2.21) between the sets X and Y, or synonymously, “mapping X onto Y in a 
one-to-one fashion.” 

If the set Y in which f(x) takes its values is the real line R, we call f(x) a 
numerical (or real) function, while if Y is the “vector space” $R_n$ of Sec. 2.6, we 
call f(x) a vector function. If the domain X of the function f(x) is the real line R, 
the extended real line $\bar{R}$ (Sec. 1.91) or some set $E \subset R$, we call f(x) a 
function of a real variable. 

If the domain of f(x) is the set of positive integers 1, 2, 3, ..., we call f(x) a 
sequence of points in Y. In this case, we usually denote f(x) by f(n) or $f_n$. Note 
that the concept of a sequence of points in a set Y does not reduce to that of a (at 
most countable) subset of Y, since points can “repeat themselves” in a sequence 
but not in a subset. For example, the sequence 

$f_1 = a, f_2 = a, f_3 = a, \ldots$ 

(where a is a fixed element of Y) is certainly not the same as the set consisting of 
the single element a, while the sequence 

$f_1 = a, f_2 = b_1, f_3 = a, f_4 = b_2, f_5 = a, f_6 = b_3, \ldots$ 

is certainly not the same as the countable subset $\{a, b_1, b_2, b_3, \ldots\} \subset Y$. 
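The difference between a sequence and the set of its values can be seen on a finite truncation of the second example. In the sketch below the name alternating and the string labels are hypothetical, and only the first six terms are generated.

```python
from itertools import islice

def alternating(a, bs):
    """Yield a, b_1, a, b_2, a, b_3, ... for an iterable bs of b_1, b_2, ..."""
    for b in bs:
        yield a
        yield b

# First six terms of the sequence f_1 = a, f_2 = b_1, f_3 = a, f_4 = b_2, ...
terms = list(islice(alternating("a", ["b1", "b2", "b3"]), 6))

# The set of values forgets both the order and the repetitions of a.
values = set(terms)
```

The list terms has six entries, three of them equal to "a", while values has only four elements.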


2.82. Given any two sets X and Y, by the direct product of X and Y, written 
X × Y, we mean the set of all ordered pairs (x, y) where the first element x belongs 
to X and the second element y belongs to Y. For example, the two-dimensional 
space $R_2$ (the “xy-plane”) is the direct product of two real lines $R_1 = R$. Two 
elements of X × Y, i.e., two ordered pairs (x, y) and (x′, y′), are said to be equal if 
x = x′, y = y′. 

Suppose we fix an element $y_0 \in Y$ and consider the subset of all points 
$(x, y_0) \in X \times Y$. This subset, which is obviously equivalent to the set X itself, 
is called the section of X × Y corresponding to the element $y_0$. The whole direct 
product X × Y is clearly the union of all the different sections of X × Y, each 
equivalent to X (and to every other section as well). 


2.83. By the graph of a given function y = f(x) with domain X and values in Y we 
mean the subset of the direct product X × Y consisting of all pairs (x, y) for 
which y = f(x). If $X = R_1$, $Y = R_1$, this coincides with the usual definition of the 
graph of a numerical function of a real variable; in most cases of practical 
importance, the graph of such a function is represented by some curve in the 
xy-plane. If the domain X of the function f(x) is the plane of points $(x_1, x_2)$ while 
the set Y is the real line, then the graph of f(x) is some set in $R_3$ which can often 
be thought of as a surface. If the domain X of the function f(x) is the real line while 
the set Y is the plane of points $(y_1, y_2)$, then the graph of f(x) is again a set in 
$R_3$, but this time the graph is best thought of as a curve (since it intersects every 
plane x = const in just one point). The above examples constitute some of the 
most important objects studied in mathematical analysis. 


2.84. “Single-valued” versus “multiple-valued” functions. According to the 
definition of Sec. 2.81, every function f is single-valued in the sense that it 
associates a unique element $y \in Y$ with each element $x \in X$. However, the 
expression multiple-valued function is often encountered in the literature, by 
which is meant a “function” which associates not just one but several elements 
$y \in Y$ with each $x \in X$. We could extend the definition of a “function” in this 
direction, but we prefer not to do so, since it would lead to difficulties in 
defining operations on such “functions.” Nevertheless, such functions will be 
found useful now and then. For example, several single-valued functions are 
often combined in a single formula for the sake of brevity, and such a 
combination can be regarded as a “multiple-valued” function. Thus the 
“double-valued” function 

$y = x \pm \sqrt{1 - x^2} \qquad (-1 \le x \le 1)$ 

is simply a combination of the two single-valued functions 

$y_1 = x + \sqrt{1 - x^2}, \qquad y_2 = x - \sqrt{1 - x^2} \qquad (-1 \le x \le 1)$. 


Problems 


1. Prove that the following sets are countable: 

(a) The set of all intervals a < x < b with rational end points; 

(b) The set of all polygonal lines in the plane with finitely many segments and 
vertices at rational points. 


2. Prove that the following sets are either finite or countable: 

(a) The set of all pairwise nonintersecting intervals on the line; 

(b) The set of all closed self-intersecting “figure eights” in the plane with no 
points in common; 

(c) The set M of all positive real numbers such that all finite sums 

$\sum_k x_k \qquad (x_k \in M)$ 

are bounded by a fixed number A. 

Comment. Analogous results have been proved for the set of all 
nonintersecting figures in the plane with triple points (like the letter T), as well 
as for the set of all nonintersecting figures in space with singular points of the 
“button” type or pieces of the “Möbius strip” type. 

3. Represent the set of natural numbers 1, 2, ... as the union of a countable 


collection of nonintersecting countable sets. 

4 (A riddle). I. As related by the mathematician X, he was once visited by the 
brothers N, who, upon entering, took off their hats and hung them up on a rack in 
the hall. Later, when the guests were leaving and getting ready to put on their 
hats, it turned out to the host’s chagrin that one hat was missing, although 
nobody had entered the hall during the time of the visit. 

II. When the brothers N visited X on another occasion, they again hung up 
their hats on the rack in the hall. Later, when the guests were leaving and getting 
ready to put on their hats, it turned out that there was an extra hat, although both 
the host and the guests were certain that there had been no hat on the rack when 
the guests arrived. 

III. On the next visit, the guests put on their hats and left, and the host 
accompanied them to the street. Upon returning, he discovered the same number 
of hats on the rack as before the guests had left. 

IV. Finally, on still another visit, the guests arrived without hats, and, upon 
leaving, put on the hats left over from the last visit. After accompanying the 
guests to the street, the host returned to discover once again the same number of 
hats on the rack as before the guests had left. 

Explain all these seemingly paradoxical events! 

5. Suppose we add the elements of a finite or countable set B to an infinite set A. 
Prove that the resulting set is equivalent to the original set A. 

6. Prove that the set J of all irrational numbers and the set T of all transcendental 
numbers both have the power of the continuum. 

7. Prove that the set A of all sequences consisting only of zeros and ones has the 
power of the continuum. 

8. Prove that the set of all increasing sequences of natural numbers
$n_1, n_2, \ldots$ ($0 < n_1 < n_2 < \cdots$) has the power of the continuum.

9. Prove that the set of all sequences of natural numbers $n_1, n_2, \ldots$ (not
necessarily increasing) has the power of the continuum.

10. Prove that the set of all sequences of real numbers $\xi_1, \xi_2, \ldots$ has
the power of the continuum.

11. Prove that the set of all points of the n-dimensional space $R_n$
(n = 1, 2, ...) has the power of the continuum.

12. Prove that the set E of all functions y = f(x) on the interval [0, 1] whose range 
consists of two distinct points does not have the power of the continuum. 

13. Prove that the set of all subsets of a given set A is not equivalent to A itself. 


† Not to be confused with the “arithmetic sum” and “arithmetic product”
introduced in Problems 18 and 19 of Chapter 1.

† The reader should prove these formulas as an exercise.

‡ Here the sets $B_\nu$ are indexed by a parameter $\nu$ (think of $\nu$ as ranging
from 1 to n or over all positive integers or over some more general “index set”).

† Thus $A_k$ (k = 1, 2, ...) is the set of all fractions of the form n/k, where
n = 0, ±1, ±2, ...

† See N. Bourbaki, Elements of Mathematics, Theory of Sets, Addison-Wesley
Publishing Co., Reading, Mass. (1968), Chapter 4.

† Alternatively, we sometimes call x a point of $R_n$, with coordinates
$x_1, \ldots, x_n$.

† See e.g., G. E. Shilov, Linear Algebra (translated by R. A. Silverman), Dover
Publications, Inc., N.Y. (1977), Theorem 3.12a.

† Shilov, Linear Algebra, Theorems 3.12a and 3.12b.

‡ This section can be omitted by readers unfamiliar with elementary linear algebra.

† Shilov, Linear Algebra, Sec. 4.76.

† For the proof, see A. G. Kurosh, Lectures on General Algebra (translated by
K. A. Hirsch), Chelsea Publishing Co., N.Y. (1963), Sec. 38.

‡ Cf. Secs. 2.21 and 2.67.

† We also call f a function (defined) on X. The set of all values actually taken
by f, i.e., the set $\{y : y = f(x),\ x \in X\}$, is called the range of f.

† See V. V. Grushin and V. P. Palamodov, Uspekhi Mat. Nauk, vol. 17, no. 3
(1962), pp. 163-168.


3 Metric Spaces 


Metric spaces are among the most important mathematical structures. In 
particular, the general theory of limits to be given in Chapter 4 will be developed 
in a metric space context. 


3.1. Definitions and Examples 


3.11. By a metric space M is meant any set of elements x, y, z, ..., called
“points,” equipped with a rule associating a number $\rho(x, y)$, called the
“distance from x to y,”† with every pair of points x and y in M, where the rule
satisfies the following requirements (axioms): (a) $\rho(x, y) > 0$ if $x \ne y$,
$\rho(x, x) = 0$ for every x; (b) $\rho(y, x) = \rho(x, y)$ for every x and y
(symmetry of the distance); (c) $\rho(x, z) \le \rho(x, y) + \rho(y, z)$ for every
x, y, z (triangle inequality).

The triangle inequality generalizes a fact familiar from elementary geometry, 
namely that the length of the side xz of a triangle xyz does not exceed the sum of 
the lengths of the sides xy and yz. 

The rule associating the number $\rho(x, y)$ with the pair of points $x, y \in M$
is called the metric (or distance) of the space M. Note that every subset
$E \subset M$ of a metric space M is itself a metric space, with the same metric as
specified in the whole space M.
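As an informal illustration, the axioms can be tested mechanically on a finite sample of points. The following Python sketch is not part of the theory: the name `check_metric_axioms` and the sample points are our own choices, and a passing check on a sample is of course no proof that a rule is a metric on the whole space.

```python
from itertools import product

def check_metric_axioms(points, rho, tol=1e-12):
    """Return True if rho satisfies Axioms a-c on every choice of sample points."""
    for x, y in product(points, repeat=2):
        d = rho(x, y)
        # Axiom a: zero on the diagonal, positive for distinct points.
        if x == y and abs(d) > tol:
            return False
        if x != y and d <= 0:
            return False
        # Axiom b: symmetry of the distance.
        if abs(d - rho(y, x)) > tol:
            return False
    for x, y, z in product(points, repeat=3):
        # Axiom c: triangle inequality.
        if rho(x, z) > rho(x, y) + rho(y, z) + tol:
            return False
    return True

sample = [-2.0, -0.5, 0.0, 1.0, 3.5]
print(check_metric_axioms(sample, lambda x, y: abs(x - y)))    # usual distance
print(check_metric_axioms(sample, lambda x, y: (x - y) ** 2))  # squared distance
```

On this sample the usual distance passes, while the squared distance fails the triangle inequality (take x = -2, y = 0, z = 3.5).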

We now prove a number of simple consequences of Axioms a–c:

a. THEOREM. The inequality

$$\rho(x_1, x_n) \le \rho(x_1, x_2) + \rho(x_2, x_3) + \cdots + \rho(x_{n-1}, x_n)$$

holds for arbitrary points $x_1, \ldots, x_n$ of a metric space M.‡

Proof. It follows from Axiom c that

$$\begin{aligned}
\rho(x_1, x_n) &\le \rho(x_1, x_2) + \rho(x_2, x_n) \\
&\le \rho(x_1, x_2) + \rho(x_2, x_3) + \rho(x_3, x_n) \\
&\le \cdots \le \rho(x_1, x_2) + \rho(x_2, x_3) + \cdots + \rho(x_{n-1}, x_n). \quad ∎
\end{aligned}$$

b. THEOREM. The “quadrilateral inequality”

$$|\rho(x, y) - \rho(z, v)| \le \rho(x, z) + \rho(y, v) \tag{1}$$

holds for arbitrary points x, y, z, v of a metric space M.

Proof. By Theorem 3.11a, we have

$$\rho(x, y) \le \rho(x, z) + \rho(z, v) + \rho(v, y), \qquad
\rho(z, v) \le \rho(z, x) + \rho(x, y) + \rho(y, v),$$

or

$$\rho(x, y) - \rho(z, v) \le \rho(x, z) + \rho(v, y), \qquad
\rho(z, v) - \rho(x, y) \le \rho(z, x) + \rho(y, v).$$

The right-hand sides of the last two inequalities coincide, by Axiom b, while the
left-hand sides differ only in sign. Thus the two inequalities together are
equivalent to (1). ∎


c. Setting v = y in the inequality (1), we get

$$|\rho(x, y) - \rho(z, y)| \le \rho(x, z) \tag{1'}$$

(in elementary geometry, this means that the difference between the lengths of
two sides of a triangle cannot exceed the length of the third side).


3.12. a. A set E in a metric space M is said to be bounded if the distance
$\rho(a, x)$ from some fixed point $a \in M$ to a variable point $x \in E$ is
bounded by a fixed constant. In this case, the distance from any other fixed point
$b \in M$ to a variable point $x \in E$ is also bounded by a fixed constant, and the
same is true of the distance $\rho(x, y)$ between two variable points
$x, y \in E$, as follows at once from the triangle inequalities

$$\rho(b, x) \le \rho(b, a) + \rho(a, x), \qquad \rho(x, y) \le \rho(x, a) + \rho(a, y).$$

The quantity

$$\operatorname{diam} E = \sup_{x, y \in E} \rho(x, y)$$

is called the diameter of the (bounded) set E.
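For a finite set the supremum is attained, so the diameter is simply the largest pairwise distance. A minimal Python sketch, assuming points on the real line with the distance |x - y| (the helper name `diameter` is ours):

```python
from itertools import combinations

def diameter(points, rho):
    """Largest pairwise distance in a finite set (0.0 for fewer than two points)."""
    return max((rho(x, y) for x, y in combinations(points, 2)), default=0.0)

E = [-1.5, 0.25, 4.0]
print(diameter(E, lambda x, y: abs(x - y)))  # distance between -1.5 and 4.0
```

For an infinite bounded set the supremum need not be attained, so this computation only illustrates the finite case.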


b. For $M = R_1$ (the real line) boundedness of the set E in the sense of
Sec. 3.12a is equivalent to boundedness of E in the ordinary sense (see
Sec. 1.66), since in this case both definitions simply mean that E lies in some
(finite) closed interval.


c. A set E in a metric space M is said to be unbounded if given any $C > 0$, there
exists a pair of points $x, y \in E$ such that $\rho(x, y) > C$. If E is unbounded,
then, given any $C > 0$ and any fixed point $a \in M$, we can find a point
$x \in E$ such that $\rho(a, x) > C$, since otherwise E would be bounded, and hence,
as shown above, so would the distance between any pair of points $x, y \in E$.

To get the definition of a bounded or an unbounded metric space, we need 
only set E = M in the above definitions. 


d. The set of all points x of a metric space M whose distance from a fixed point
$x_0$ is less than a given number $r > 0$, i.e., such that $\rho(x, x_0) < r$, is
called a ball (more exactly, an open ball) of radius r centered at $x_0$. The set
of all points x such that $\rho(x, x_0) \le r$ is called a closed ball of radius r
centered at $x_0$. Finally, the set of all points x whose distance from $x_0$ is
exactly equal to r, i.e., such that

$$\rho(x, x_0) = r,$$

is called a sphere of radius r centered at $x_0$.

e. Any (open) ball centered at $x_0$ is called a neighborhood of the point $x_0$.
A point $x_0$ is called an interior point of a set $E \subset M$ if E contains
both $x_0$ and some neighborhood of $x_0$.


3.13. Examples 


a. Any set M on the real line $R_1$ is a metric space when equipped with the
distance $\rho(x, y) = |x - y|$.

In fact, the validity of Axioms a–c of Sec. 3.11 follows at once from the familiar
properties of the absolute value (in particular, see Theorem 1.54b). If M is the
whole real line, then the open ball of radius r centered at $x_0$ is the open
interval $|x - x_0| < r$, the closed ball of radius r centered at $x_0$ is the
closed interval $|x - x_0| \le r$, while the sphere of radius r centered at $x_0$
is just the pair of points $x = x_0 \pm r$.

b. In just the same way, a set M in the plane $R_2$ or in the three-dimensional
space $R_3$ is a metric space if the distance between two points
$x = (\xi_1, \xi_2, \xi_3)$, $y = (\eta_1, \eta_2, \eta_3)$ (where, to be explicit,
we consider the case of $R_3$) is defined as the usual geometric distance

$$\rho(x, y) = \sqrt{(\xi_1 - \eta_1)^2 + (\xi_2 - \eta_2)^2 + (\xi_3 - \eta_3)^2}.$$

Here the triangle inequality (Axiom c) becomes an ordinary geometric
inequality, i.e., the length of one side of a triangle does not exceed the sum of the
lengths of the other two sides. A more general example is considered in the next
section.


3.14. a. Consider the n-dimensional space $R_n$ (Sec. 2.61). Given any two vectors

$$x = (\xi_1, \ldots, \xi_n), \qquad y = (\eta_1, \ldots, \eta_n) \tag{2}$$

in $R_n$, the number

$$(x, y) = \sum_{k=1}^{n} \xi_k \eta_k \tag{3}$$

is called the scalar product of x and y. The number

$$|x| = \sqrt{(x, x)} = \sqrt{\sum_{k=1}^{n} \xi_k^2} \ge 0$$

is called the length or norm of the vector x, and a vector x such that |x| = 1 is
called a unit vector (or normalized vector). As we will see later, the presence of
a scalar product allows us to construct a geometry in $R_n$ with suitable lengths
and angles. The space $R_n$ equipped with the scalar product (3) is called
n-dimensional Euclidean space. It should be noted that the scalar product (3) has
the following four easily verified properties (for arbitrary $x, y, z \in R_n$):
(1) $(x, x) \ge 0$; (2) $(x, y) = (y, x)$; (3) $(\alpha x, y) = \alpha (x, y)$ for
every real $\alpha$; (4) $(x + y, z) = (x, z) + (y, z)$.

Next we define the distance between the vectors (2) as the length of the
difference vector x - y, i.e., by the formula

$$\rho(x, y) = |x - y| = \sqrt{\sum_{k=1}^{n} (\xi_k - \eta_k)^2}. \tag{4}$$


The validity of the first two axioms of Sec. 3.11 is obvious in this case. To prove
the validity of Axiom c, i.e., of the triangle inequality

$$\rho(x, y) \le \rho(x, z) + \rho(y, z), \tag{5}$$

we argue as follows. The inequality (5) is equivalent to the inequality

$$|x + y| \le |x| + |y|, \tag{6}$$

since (6) can be obtained from (5) by replacing y by -y and z by 0, while
conversely, (5) can be obtained from (6) by replacing x by x - z and y by z - y.
But Properties 1-4 above imply (6) quite generally, as shown by the following

THEOREM. Let M be any set equipped with a scalar product satisfying Properties
1-4. Then the inequality (6) holds for arbitrary $x, y, z \in M$, where
$|x| = \sqrt{(x, x)}$.


Proof. Consider the expression

$$\varphi(\lambda) = (x - \lambda y, x - \lambda y),$$

where $\lambda$ is an arbitrary real number. By Properties 2-4,

$$\varphi(\lambda) = (x, x) - 2\lambda (x, y) + \lambda^2 (y, y) = A - 2B\lambda + C\lambda^2,$$

where

$$A = (x, x), \qquad B = (x, y), \qquad C = (y, y).$$

Moreover $\varphi(\lambda) \ge 0$, by Property 1. Hence the polynomial
$\varphi(\lambda)$ cannot have distinct real roots, since otherwise it would take
both positive and negative values. It follows that the quantity
$B^2 - AC = (x, y)^2 - (x, x)(y, y)$ (appearing under the radical in the formula
for the solution of the corresponding quadratic equation) cannot be positive, and
hence $(x, y)^2 \le (x, x)(y, y) = |x|^2 |y|^2$.

This implies the Schwarz inequality

$$|(x, y)| \le |x|\,|y|. \tag{7}$$

Using (7), we find that

$$|x + y|^2 = (x + y, x + y) = (x, x) + 2(x, y) + (y, y)
\le |x|^2 + 2|x|\,|y| + |y|^2 = (|x| + |y|)^2,$$

which is equivalent to (6). ∎
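The Schwarz inequality (7) and the triangle inequality (6) can be spot-checked numerically for the scalar product (3). The sketch below is only a sanity check on random vectors, not a substitute for the proof; the helper names `dot`, `norm` and `schwarz_gap`, the sample ranges and the rounding tolerance are our own assumptions.

```python
import math
import random

def dot(x, y):
    """Scalar product (3) of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length |x| = sqrt((x, x))."""
    return math.sqrt(dot(x, x))

def schwarz_gap(x, y):
    """|x||y| - |(x, y)|, which inequality (7) says is never negative."""
    return norm(x) * norm(y) - abs(dot(x, y))

random.seed(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    n = random.randint(1, 6)
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    total = [a + b for a, b in zip(x, y)]
    assert schwarz_gap(x, y) >= -1e-9               # inequality (7)
    assert norm(total) <= norm(x) + norm(y) + 1e-9  # inequality (6)
```

The gap vanishes (up to rounding) exactly when one vector is a multiple of the other, for instance x = (1, 2) and y = (2, 4).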


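b. It follows from the above considerations that the space $R_n$ equipped with the
distance (4) is a metric space, and so is any subset $E \subset R_n$ equipped with
the same distance. In terms of the components of the vectors (2), the inequality
(7) takes the form

$$\left|\sum_{k=1}^{n} \xi_k \eta_k\right| \le
\sqrt{\sum_{k=1}^{n} \xi_k^2}\,\sqrt{\sum_{k=1}^{n} \eta_k^2}, \tag{8}$$

known as Cauchy’s inequality. Similarly, (6) becomes

$$\sqrt{\sum_{k=1}^{n} (\xi_k + \eta_k)^2} \le
\sqrt{\sum_{k=1}^{n} \xi_k^2} + \sqrt{\sum_{k=1}^{n} \eta_k^2} \tag{9}$$

in component form. Moreover, setting y = 0, $z = (\zeta_1, \ldots, \zeta_n)$ in the
inequality (1'), we get

$$\left|\sqrt{\sum_{k=1}^{n} \xi_k^2} - \sqrt{\sum_{k=1}^{n} \zeta_k^2}\right| \le
\sqrt{\sum_{k=1}^{n} (\xi_k - \zeta_k)^2}. \tag{10}$$

The inequalities (8)-(10), valid for all real numbers $\xi_1, \ldots, \xi_n$,
$\eta_1, \ldots, \eta_n$, $\zeta_1, \ldots, \zeta_n$, are very often used in
analysis to make all kinds of estimates, even quite independently of their
geometric origin in the theory of metric spaces.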


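THEOREM. The inequality

$$\max_{1 \le k \le n} |\xi_k - \eta_k| \le \rho(x, y)
= \sqrt{\sum_{k=1}^{n} (\xi_k - \eta_k)^2}
\le \sqrt{n} \max_{1 \le k \le n} |\xi_k - \eta_k| \tag{11}$$

holds for arbitrary real $\xi_1, \ldots, \xi_n$, $\eta_1, \ldots, \eta_n$.

Proof. Since obviously

$$(\xi_k - \eta_k)^2 \le \sum_{j=1}^{n} (\xi_j - \eta_j)^2,$$

we have

$$|\xi_k - \eta_k| \le \sqrt{\sum_{j=1}^{n} (\xi_j - \eta_j)^2}$$

(cf. Sec. 1.55f). Taking the maximum of the left-hand side with respect to k, we
get part of (11). On the other hand, clearly

$$\sum_{k=1}^{n} (\xi_k - \eta_k)^2 \le \sum_{k=1}^{n} \max_{1 \le k \le n} (\xi_k - \eta_k)^2
= n \max_{1 \le k \le n} (\xi_k - \eta_k)^2
= n \left\{\max_{1 \le k \le n} |\xi_k - \eta_k|\right\}^2,$$

which implies the other part of (11). ∎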
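A small numerical illustration of (11) may help fix the roles of the two bounds. The Python sketch below uses two arbitrarily chosen vectors in $R_3$; the helper name `sup_and_euclid` is ours.

```python
import math

def sup_and_euclid(x, y):
    """Return (max_k |xi_k - eta_k|, Euclidean distance) for two vectors."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    return max(diffs), math.sqrt(sum(d * d for d in diffs))

x = [1.0, -2.0, 0.5]
y = [0.0, 2.0, 0.5]
largest, distance = sup_and_euclid(x, y)
n = len(x)
# Formula (11): largest <= distance <= sqrt(n) * largest.
print(largest, distance, math.sqrt(n) * largest)
```

Here the componentwise differences are 1, 4 and 0, so the three printed numbers are 4, the square root of 17, and 4 times the square root of 3, in increasing order.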


c. To say that a set $E \subset R_n$ is bounded means that the distance between a
variable point $x = (\xi_1, \ldots, \xi_n) \in E$ and a fixed point of $R_n$, say
the point 0 = (0, ..., 0), is bounded by a fixed constant (Sec. 3.12a). Thus if E
is bounded, there exists a constant $C > 0$ such that

$$|x| = \sqrt{\sum_{k=1}^{n} \xi_k^2} \le C$$

for all $x \in E$. But then obviously

$$|\xi_k| \le C \qquad (k = 1, \ldots, n) \tag{12}$$

for all $x \in E$, i.e., every coordinate of the point $x \in E$ is bounded (on the
real line). Conversely, let $E \subset R_n$ be a set of points all of whose
coordinates are bounded, i.e., such that (12) holds for some constant $C > 0$. Then

$$|x| = \sqrt{\sum_{k=1}^{n} \xi_k^2} \le \sqrt{\sum_{k=1}^{n} C^2} = \sqrt{n}\,C,$$

so that E is bounded in the space $R_n$. Thus, finally, a set $E \subset R_n$ is
bounded if and only if the set of values of each coordinate of the points
$x \in E$ is bounded (on the real line).


3.15. a. Regarding metric spaces as mathematical structures and recalling the
considerations of Sec. 2.52, we now introduce the notion of an isomorphism of
metric spaces, called an isometry in the present context. Two metric spaces M
and M' equipped with metrics $\rho$ and $\rho'$, respectively, are said to be
isometric if there is a one-to-one correspondence between elements $x, y \in M$
and the corresponding elements $x', y' \in M'$ which “preserves distances,” in the
sense that

$$\rho(x, y) = \rho'(x', y').$$

For example, two closed intervals of equal length (on the real line) are isometric
with the natural “metric” (distance function), but not two intervals of unequal
length. Every plane figure is isometric to its reflection in any line. It should be
noted that any linear isometry in n-dimensional Euclidean space, i.e., any
isometry $x \leftrightarrow x'$ such that $(x + y)' = x' + y'$, also preserves
scalar products. In fact, we have

$$(x + y, x + y) = (x, x) + 2(x, y) + (y, y),$$

$$((x + y)', (x + y)') = (x' + y', x' + y') = (x', x') + 2(x', y') + (y', y'),$$

and hence

$$(x, y) = (x', y'),$$

since

$$(x, x) = |x|^2 = |x'|^2 = (x', x'), \qquad (y, y) = |y|^2 = |y'|^2 = (y', y'),$$

by the very definition of an isometry. A simple example of an isometric mapping
of the n-dimensional Euclidean space $R_n$ into itself (an automorphism of $R_n$)
is given by reflection in the plane $x_n = 0$, which carries every vector
$x = (\xi_1, \ldots, \xi_n)$ into the vector
$x' = (\xi_1, \ldots, \xi_{n-1}, -\xi_n)$. The mapping which shifts every vector in
$R_n$ by the vector $\beta = (\beta_1, \ldots, \beta_n)$, i.e., which carries every
vector $x = (\xi_1, \ldots, \xi_n)$ into the vector
$x + \beta = (\xi_1 + \beta_1, \ldots, \xi_n + \beta_n)$, is not an automorphism of
$R_n$ (Sec. 2.67) if $\beta \ne 0$, since it does not leave the zero vector
unchanged.


b. Suppose there exists an isometric mapping of $R_n$ into itself carrying the set
$E \subset R_n$ into the set $F \subset R_n$. Then the sets E and F are said to be
congruent, in keeping with the terminology of elementary geometry.


3.16. Metrization of the direct product of two metric spaces. Let $M_1$ be a
metric space with metric $\rho_1$ and points $x_1, y_1, \ldots$, and let $M_2$ be
another metric space with metric $\rho_2$ and points $x_2, y_2, \ldots$ Let
$M = M_1 \times M_2$ be the direct product of the sets $M_1$ and $M_2$ (see
Sec. 2.82), i.e., the set of all pairs $x = (x_1, x_2)$ with $x_1 \in M_1$,
$x_2 \in M_2$. Suppose we define a metric $\rho$ on the set M by the formula

$$\rho(x, y) = \rho((x_1, x_2), (y_1, y_2)) = \max\{\rho_1(x_1, y_1), \rho_2(x_2, y_2)\}. \tag{13}$$

THEOREM. The set M equipped with the distance function (13) is a metric space.

Proof. We must verify that (13) satisfies Axioms a–c of Sec. 3.11. If $x \ne y$,
then either $x_1 \ne y_1$ or $x_2 \ne y_2$, i.e., either $\rho_1(x_1, y_1) > 0$ or
$\rho_2(x_2, y_2) > 0$. Hence $\rho(x, y) > 0$ in both cases, by the definition
(13). Moreover, if x = y, then $x_1 = y_1$, $x_2 = y_2$. Therefore
$\rho_1(x_1, y_1) = 0$, $\rho_2(x_2, y_2) = 0$, and hence $\rho(x, y) = 0$. Thus
Axiom a holds. Axiom b follows at once from the observation that

$$\rho(y, x) = \max\{\rho_1(y_1, x_1), \rho_2(y_2, x_2)\}
= \max\{\rho_1(x_1, y_1), \rho_2(x_2, y_2)\} = \rho(x, y).$$

Finally, let $z = (z_1, z_2)$, and consider the quantity

$$\rho(x, z) = \max\{\rho_1(x_1, z_1), \rho_2(x_2, z_2)\}.$$

Suppose, to be explicit, that $\rho(x, z) = \rho_1(x_1, z_1)$. Then

$$\rho(x, z) = \rho_1(x_1, z_1) \le \rho_1(x_1, y_1) + \rho_1(y_1, z_1) \le \rho(x, y) + \rho(y, z),$$

and similarly if $\rho(x, z) = \rho_2(x_2, z_2)$. This establishes Axiom c. ∎


Note that the metric (13) coincides with the metric of the space $M_1$ on every
section (x, z) with fixed $z \in M_2$ (see Sec. 2.82), so that every section
(x, z) is not only equivalent to $M_1$ but isometric to $M_1$ as well. Note also
that (13) is not the only way of defining a metric on the direct product of two
metric spaces $M_1$ and $M_2$. For example, we can also use the formula

$$\rho(x, y) = \rho_1(x_1, y_1) + \rho_2(x_2, y_2)$$

or the formula

$$\rho(x, y) = \sqrt{\rho_1^2(x_1, y_1) + \rho_2^2(x_2, y_2)}$$

(give a detailed verification of Axioms a–c in both cases).
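The three product metrics are easy to compare on a concrete pair of points. In the following Python sketch both factors are taken to be the real line with the usual distance, which is our own choice of example; the function names are likewise hypothetical.

```python
import math

def product_metrics(rho1, rho2):
    """Return the max, sum and root-of-squares metrics on M1 x M2."""
    def rho_max(x, y):
        return max(rho1(x[0], y[0]), rho2(x[1], y[1]))

    def rho_sum(x, y):
        return rho1(x[0], y[0]) + rho2(x[1], y[1])

    def rho_root(x, y):
        return math.hypot(rho1(x[0], y[0]), rho2(x[1], y[1]))

    return rho_max, rho_sum, rho_root

line = lambda a, b: abs(a - b)  # usual distance on the real line
rho_max, rho_sum, rho_root = product_metrics(line, line)
x, y = (0.0, 0.0), (3.0, 4.0)
print(rho_max(x, y), rho_sum(x, y), rho_root(x, y))
```

For these two points the component distances are 3 and 4, so the three metrics give 4, 7 and 5 respectively: different numbers, although each rule satisfies Axioms a–c.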


3.2. Open Sets 


3.21. A set G in a metric space M is said to be open if every point $x_0 \in G$ is
an interior point of G, i.e., if whenever G contains $x_0$, G also contains some
open ball centered at $x_0$ (the radius of the ball depends in general on $x_0$).

THEOREM. The open ball

$$U = \{x \in M : \rho(x, x_1) < r\}$$

is an open set.

Proof. Given any $x_0 \in U$, let $\rho(x_0, x_1) = \theta < r$, and consider any
ball $U_0$ of radius $r_0 < r - \theta$ centered at $x_0$. Then $U_0$ is
(entirely) contained in U, since

$$\rho(x, x_1) \le \rho(x, x_0) + \rho(x_0, x_1) < r_0 + \theta < r$$

for all $x \in U_0$, by the triangle inequality. ∎


3.22. THEOREM. The union of any collection of open sets and the intersection of
any finite collection of open sets are themselves open sets.

Proof. The first assertion is an immediate consequence of the definition of an
open set. As for the second assertion, suppose $x_0$ belongs to all the open sets
$G_1, G_2, \ldots, G_m$. Then $G_1$ contains not only $x_0$ but also some ball of
radius $r_1$ (centered at $x_0$), $G_2$ contains not only $x_0$ but also some ball
of radius $r_2$, and so on. Then the ball of radius
$\min\{r_1, r_2, \ldots, r_m\}$ centered at $x_0$ is contained in all the sets
$G_1, G_2, \ldots, G_m$, and hence is contained in their intersection. ∎

The proof just given does not work for the intersection of an infinite collection
of open sets, since the minimum (more exactly, the greatest lower bound) of an
infinite set of positive numbers may equal zero. Thus, for example, the
intersection of the infinitely many open sets
$G_n = \{x : \rho(x, x_0) < 1/n\}$ (n = 1, 2, ...) contains only those points x
such that $\rho(x, x_0) = 0$, and hence, by Axiom a of Sec. 3.11, only the single
point $x_0$. But this intersection is in general not open.


3.23. Every open interval $(\alpha, \beta)$, bounded or unbounded, is obviously an
open set on the real line $-\infty < x < \infty$. Moreover, every finite or
countable union of open intervals† $(\alpha_\nu, \beta_\nu)$
($\nu = 1, 2, \ldots$) is also an open set.

THEOREM. Every open set G on the real line is a finite or countable union of
nonintersecting open intervals.


Proof. Let x be any point in G. Then, by definition, G contains both x and some
open ball (i.e., some open interval) centered at x. We now construct the largest
open interval containing x and contained in G. To this end, let E denote the set of
points which lie to the right of x and do not belong to G. If E is empty, the whole
half-line $(x, \infty)$ is contained in G, while if E is nonempty, then E has a
greatest lower bound $\eta$. The point $\eta$ certainly does not belong to G, since
every point of G has a neighborhood entirely contained in G and hence not
containing any points of E, while, on the other hand, every neighborhood of $\eta$
must contain points of E, since $\eta$ is the greatest lower bound of E. Thus, in
particular, $\eta \ne x$. It is also obvious that the whole interval $(x, \eta)$ is
contained in G.

Next we carry out a similar construction to the left of the point x. This leads
either to the half-line $(-\infty, x)$ or to an interval $(\xi, x)$ contained in G
whose left-hand end point $\xi$ does not belong to G. Thus, starting from a given
point $x \in G$, we have constructed an open interval $(\xi, \eta)$ contained in G,
neither end point of which belongs to G (unless it is infinite). An interval of
this type is called a component of the open set G.

Two components $(\xi_1, \eta_1)$ and $(\xi_2, \eta_2)$ sharing a common point
$x_0$ must coincide. In fact, the inequality $\eta_1 < \eta_2$ (say) is impossible,
since the point $\eta_1$ must belong to G, being an interior point of the interval
$(x_0, \eta_2)$, while at the same time $\eta_1$ cannot belong to G, being an end
point of the interval $(\xi_1, \eta_1)$.

Hence every open set G is a union of nonintersecting components. There can be no
more than a countable number of such components, since we can choose a
rational point in each component (by Theorem 1.75) and the set of all rational
points is countable (by Theorem 2.34). ∎
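When G is given as a finite union of open intervals, its components can be found by merging overlapping intervals. The Python sketch below handles only this finite case (the theorem itself covers arbitrary open sets), and the function name `components` is our own.

```python
def components(intervals):
    """Merge open intervals (a, b) into the disjoint components of their union."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        # Two open intervals lie in one component only if they overlap:
        # (0, 1) and (1, 2) stay separate because the point 1 is missing.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

print(components([(0, 1), (0.5, 2), (2, 3), (5, 6), (5.5, 5.8)]))
```

In this example the union has the three components (0, 2), (2, 3) and (5, 6); the end points 2 and 3 do not belong to the set, in agreement with the proof.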


3.3. Convergent Sequences and Homeomorphisms 


3.31. Convergent sequences. A sequence

$$x_1, x_2, \ldots, x_n, \ldots \tag{1}$$

of points in a metric space M (equipped with a metric $\rho$) is said to converge
to a point $x \in M$ if, given any $\varepsilon > 0$, there exists an integer
$N > 0$ such that $\rho(x, x_n) < \varepsilon$ for all $n > N$. In other words, a
sequence $x_n$† is said to converge to a point x if every open ball centered at x
contains all points of the sequence starting from some value of n (so that only a
finite number of points lie outside the ball). The point x is called the limit of
the sequence,‡ and the fact that $x_n$ converges to x is expressed by writing

$$x_n \to x \quad \text{or} \quad x = \lim_{n \to \infty} x_n.$$

The symbol $x_n \to x$ is read “$x_n$ converges to x” or “$x_n$ approaches x” (as
$n \to \infty$, i.e., as “n approaches infinity”). It should be borne in mind that
the convergence is always with respect to some underlying metric $\rho$, and that
the integer N will in general depend on $\varepsilon$. A sequence $x_n$ is said to
be convergent if it approaches some limit as $n \to \infty$. Otherwise the sequence
is said to be divergent (or to diverge).


3.32. a. Let M be the real line $R_1$ with metric

$$\rho(x, y) = |x - y|. \tag{2}$$

Then, in keeping with the above definition, we say that a sequence of real
numbers $x_1, x_2, \ldots, x_n, \ldots$ converges to a (real) limit $\xi$ if, given
any $\varepsilon > 0$, there exists an integer $N > 0$ such that
$|\xi - x_n| < \varepsilon$ for all $n > N$.

Thus, for example, the sequence of points

$$x_n = \frac{1}{n} \qquad (n = 1, 2, \ldots)$$

on the real line, with the metric (2), converges to the point x = 0. In fact, given
any $\varepsilon > 0$, choose an integer $N > 1/\varepsilon$. Then

$$|x_n - 0| = \frac{1}{n} < \frac{1}{N} < \varepsilon$$

for all $n > N$, i.e., the points $x_n$ with $n > N$ all fall in a ball of radius
$\varepsilon$ centered at the point x = 0.
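The choice of N in this example can be made explicit. The Python sketch below picks the smallest integer N exceeding 1/ε and checks a stretch of later terms; the function name `index_for` and the tested values of ε are our own.

```python
import math

def index_for(eps):
    """Smallest integer N with N > 1/eps, as chosen in the text."""
    return math.floor(1 / eps) + 1

for eps in (0.5, 0.1, 0.003):
    N = index_for(eps)
    # Every checked term with n > N lies in the ball of radius eps about 0.
    assert all(1 / n < eps for n in range(N + 1, N + 1000))
    print(eps, N)
```

Any larger N would serve equally well; the definition of convergence asks only that some N exist for each ε.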


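b. THEOREM.† If

$$\lim_{n \to \infty} x_n = x, \qquad \lim_{n \to \infty} y_n = y,$$

then

$$\lim_{n \to \infty} \rho(x_n, y_n) = \rho(x, y).$$

Proof. Given any $\varepsilon > 0$, there is an $N > 0$ such that

$$\rho(x, x_n) < \frac{\varepsilon}{2}, \qquad \rho(y, y_n) < \frac{\varepsilon}{2}$$

for all $n > N$. Hence, by the quadrilateral inequality (1) of Sec. 3.11b,

$$|\rho(x, y) - \rho(x_n, y_n)| \le \rho(x, x_n) + \rho(y, y_n) < \varepsilon$$

for all $n > N$. ∎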


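c. THEOREM.‡ If $x_n$ and $y_n$ are two numerical sequences such that

$$\lim_{n \to \infty} x_n = x, \qquad \lim_{n \to \infty} y_n = y$$

and

$$x_n \le y_n \qquad (n = 1, 2, \ldots), \tag{3}$$

then

$$x \le y.$$

Proof. If $x > y$, let $\varepsilon = x - y > 0$ and choose N such that

$$|x - x_n| < \frac{\varepsilon}{2}, \qquad |y - y_n| < \frac{\varepsilon}{2}$$

for all $n > N$. Then

$$x_n > x - \frac{\varepsilon}{2} = (y + \varepsilon) - \frac{\varepsilon}{2}
> \left(y_n - \frac{\varepsilon}{2}\right) + \varepsilon - \frac{\varepsilon}{2} = y_n$$

for all $n > N$, contrary to (3), and the proof follows by contradiction. ∎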


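d. THEOREM. If $x_n$ and $y_n$ are two numerical sequences such that

$$\lim_{n \to \infty} x_n = x, \qquad \lim_{n \to \infty} y_n = y,$$

then

$$\lim_{n \to \infty} (x_n + y_n) = x + y.$$

Proof. Given any $\varepsilon > 0$, choose N such that

$$|x - x_n| < \frac{\varepsilon}{2}, \qquad |y - y_n| < \frac{\varepsilon}{2}$$

for all $n > N$. Then

$$|(x + y) - (x_n + y_n)| = |(x - x_n) + (y - y_n)| \le |x - x_n| + |y - y_n| < \varepsilon$$

for all $n > N$. ∎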


e. COROLLARY. If $x_n$, $y_n$, $z_n$ are numerical sequences such that

$$\lim_{n \to \infty} x_n = x, \qquad \lim_{n \to \infty} y_n = y, \qquad \lim_{n \to \infty} z_n = z$$

and

$$x_n + y_n \le z_n \qquad (n = 1, 2, \ldots),$$

then

$$x + y \le z.$$

Proof. An immediate consequence of the preceding two theorems. ∎


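f. THEOREM. Let M be the m-dimensional real space $R_m$ of points
$x = (\xi_1, \ldots, \xi_m)$, $y = (\eta_1, \ldots, \eta_m)$, ... equipped with the
metric

$$\rho(x, y) = \sqrt{\sum_{k=1}^{m} (\xi_k - \eta_k)^2}$$

(see Sec. 3.14a). Then a sequence of points
$x_n = (\xi_1^{(n)}, \ldots, \xi_m^{(n)}) \in R_m$ converges to a point
$x = (\xi_1, \ldots, \xi_m) \in R_m$ if and only if

$$\lim_{n \to \infty} \xi_1^{(n)} = \xi_1, \quad \ldots, \quad \lim_{n \to \infty} \xi_m^{(n)} = \xi_m, \tag{4}$$

i.e., if and only if each component of $x_n$ converges to the corresponding
component of x.

Proof. Suppose (4) holds. Then, given any $\varepsilon > 0$, we can find an N such
that all m inequalities

$$|\xi_1 - \xi_1^{(n)}| < \frac{\varepsilon}{\sqrt{m}}, \quad \ldots, \quad
|\xi_m - \xi_m^{(n)}| < \frac{\varepsilon}{\sqrt{m}}$$

hold for all $n > N$. It follows that

$$\max_{1 \le k \le m} |\xi_k - \xi_k^{(n)}| < \frac{\varepsilon}{\sqrt{m}}$$

for all $n > N$. Hence, by the second inequality in formula (11) of Sec. 3.14b,

$$\rho(x, x_n) = \sqrt{\sum_{k=1}^{m} (\xi_k - \xi_k^{(n)})^2}
\le \sqrt{m} \max_{1 \le k \le m} |\xi_k - \xi_k^{(n)}| < \varepsilon$$

for all $n > N$, i.e., $x_n \to x$ in the metric of the space $R_m$.

Conversely, suppose $x_n \to x$ in the metric of $R_m$. Then, given any
$\varepsilon > 0$, there is an N such that

$$\rho(x, x_n) = \sqrt{\sum_{k=1}^{m} (\xi_k - \xi_k^{(n)})^2} < \varepsilon$$

for all $n > N$. But then, by the first inequality in the formula just cited,

$$\max_{1 \le k \le m} |\xi_k - \xi_k^{(n)}| \le \rho(x, x_n) < \varepsilon,$$

which immediately implies (4). ∎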
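To see the theorem at work, one can follow a concrete sequence in $R_2$ and watch the largest componentwise gap and the Euclidean distance shrink together. The sequence below is a hypothetical example of our own, and `math.dist` assumes Python 3.8 or later.

```python
import math

LIMIT = (0.0, 1.0)

def x_n(n):
    """Hypothetical sequence in R_2 with components 1/n and 1 + 1/n^2."""
    return (1 / n, 1 + 1 / n**2)

def gaps(n):
    """Return (largest componentwise gap, Euclidean distance) to LIMIT."""
    point = x_n(n)
    largest = max(abs(a - b) for a, b in zip(point, LIMIT))
    return largest, math.dist(point, LIMIT)

for n in (1, 10, 100, 1000):
    largest, distance = gaps(n)
    # The two inequalities of formula (11) with m = 2, up to rounding.
    assert largest <= distance <= math.sqrt(2) * largest + 1e-12
    print(n, largest, distance)
```

Both printed columns tend to zero as n grows, which is the content of the equivalence between (4) and convergence in the metric of $R_m$.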


3.33. Returning to the case of a general metric space M with metric p, we now 
prove two further results. 


a. THEOREM. The limit of a convergent sequence $x_n$ is unique.

Proof. Suppose both $x_n \to x$ and $x_n \to y$. Then, given any
$\varepsilon > 0$, there is an N such that

$$\rho(x, x_n) < \varepsilon, \qquad \rho(y, x_n) < \varepsilon$$

for all $n > N$, and hence, by the triangle inequality,

$$\rho(x, y) \le \rho(x, x_n) + \rho(y, x_n) < 2\varepsilon$$

for all $n > N$. Since $\varepsilon > 0$ is arbitrarily small, it follows from
Theorem 1.57 that $\rho(x, y) = 0$ and hence from Axiom a of Sec. 3.11 that
x = y. ∎


b. THEOREM. Every convergent sequence $x_n$ is bounded.

Proof. Let

$$a = \lim_{n \to \infty} x_n.$$

Then, for some fixed $\varepsilon > 0$, say $\varepsilon = 1$, choose N such that
$\rho(a, x_n) < \varepsilon = 1$ for all $n > N$. Now let

$$D = \max\{\rho(a, x_1), \ldots, \rho(a, x_N)\}.$$

Then

$$\rho(a, x_n) \le \max\{D, 1\} \qquad (n = 1, 2, \ldots). \quad ∎$$


3.34. Homeomorphic metric spaces. In many problems involving metric spaces
we are interested not so much in the special form of the metric as in knowing
which sequences are convergent and which are divergent. This leads to the
following definition:

a. Two metric spaces M and M' are said to be homeomorphic if there is a
one-to-one correspondence $x \leftrightarrow x'$ between the elements $x \in M$ and
$x' \in M'$ which carries convergent sequences into convergent sequences, i.e.,
which is such that if $x_n \to x$ in M, then $x'_n \to x'$ in M', while if
$x'_n \to x'$ in M', then $x_n \to x$ in M (where $x_n \leftrightarrow x'_n$,
$x \leftrightarrow x'$).† The mapping $x \leftrightarrow x'$ is then called a
homeomorphism.


b. Next we give a criterion for a one-to-one correspondence between two metric
spaces to be a homeomorphism:

THEOREM. Let $\leftrightarrow$ be a one-to-one correspondence between a metric
space M with metric $\rho(x, y)$ and a metric space M' with metric
$\rho'(x', y')$. Then $\leftrightarrow$ is a homeomorphism if and only if, given any
$x \in M$ and $\varepsilon > 0$, there exists an $\varepsilon' > 0$ such that the
mapping $\leftrightarrow$ carries the ball
$\{y' \in M' : \rho'(x', y') < \varepsilon'\}$ into a subset of the ball
$\{y \in M : \rho(x, y) < \varepsilon\}$, and conversely, given any $x' \in M'$ and
$\varepsilon' > 0$, there exists an $\varepsilon > 0$ such that $\leftrightarrow$
carries the ball $\{y \in M : \rho(x, y) < \varepsilon\}$ into a subset of the ball
$\{y' \in M' : \rho'(x', y') < \varepsilon'\}$.


Proof. Suppose the mapping $\leftrightarrow$ is a homeomorphism, so that
$x_n \to x$ implies $x'_n \to x'$ and vice versa. Let the point $x \in M$ and the
number $\varepsilon > 0$ be fixed, and suppose there does not exist a number
$\varepsilon'$ with the property figuring in the statement of the theorem. Then the
mapping $\leftrightarrow$ carries every ball
$\{y' \in M' : \rho'(x', y') < 1/n\}$ (n = 1, 2, ...) into a set with points lying
outside the ball $\{y \in M : \rho(x, y) < \varepsilon\}$, so that, in particular,
for every n = 1, 2, ... we can find a point $y'_n \in M'$ such that

$$\rho'(x', y'_n) < \frac{1}{n}, \qquad \rho(x, y_n) \ge \varepsilon.$$

But then $y_n$ does not approach x although $y'_n \to x'$, thereby contradicting
the assumption that $\leftrightarrow$ is a homeomorphism. Hence if
$\leftrightarrow$ is a homeomorphism, it must be possible to find a number
$\varepsilon'$ with the property figuring in the statement of the theorem.
Interchanging the roles of M and M' in this argument, we can prove in the same way
that the required number $\varepsilon$ exists for every given $x' \in M'$ and
$\varepsilon' > 0$.

Conversely, suppose we can find a suitable number $\varepsilon'$ for every given
$x \in M$ and $\varepsilon > 0$. Then $x'_n \to x'$ implies $x_n \to x$. In fact,
having found $\varepsilon'$, let N be such that $\rho'(x', x'_n) < \varepsilon'$
for all $n > N$. Applying the mapping $\leftrightarrow$ and using the hypothesis,
we get $\rho(x, x_n) < \varepsilon$ for all $n > N$. Therefore $x_n \to x$, since
$\varepsilon > 0$ is arbitrary. Interchanging the roles of M and M' in this
argument, we find in the same way that $x_n \to x$ implies $x'_n \to x'$. ∎


c. We can also define different metrics on one and the same set E, thereby
converting it into different metric spaces. Two metrics $\rho(x, y)$ and
$\rho'(x, y)$ defined on the same set E are said to be homeomorphic (on E) if the
identity mapping $x \leftrightarrow x$ is a homeomorphism of the resulting metric
spaces M and M'. As applied to this situation, Theorem 3.34b takes the following
form:

THEOREM. Two metrics $\rho$ and $\rho'$ defined on the same set E are homeomorphic
if and only if given any $x \in E$ and $\varepsilon > 0$, there exists an
$\varepsilon' > 0$ such that the ball $\{y \in E : \rho'(x, y) < \varepsilon'\}$ is
contained in the ball $\{y \in E : \rho(x, y) < \varepsilon\}$, and conversely,
given any $x \in E$ and $\varepsilon' > 0$, there exists an $\varepsilon > 0$ such
that the ball $\{y \in E : \rho(x, y) < \varepsilon\}$ is contained in the ball
$\{y \in E : \rho'(x, y) < \varepsilon'\}$.


d. To illustrate the above considerations, consider the following three metrics
defined on the direct product $M = M_1 \times M_2$ of two metric spaces $M_1$ and
$M_2$ (as in Sec. 3.16):

$$\rho(x, y) = \max\{\rho_1(x_1, y_1), \rho_2(x_2, y_2)\},$$

$$\rho(x, y) = \rho_1(x_1, y_1) + \rho_2(x_2, y_2),$$

$$\rho(x, y) = \sqrt{\rho_1^2(x_1, y_1) + \rho_2^2(x_2, y_2)}.$$

The fact that all three metrics are homeomorphic on M follows from the chain of
inequalities

$$\max\{a, b\} \le \sqrt{a^2 + b^2} \le a + b \le 2 \max\{a, b\}, \tag{5}$$

valid for arbitrary numbers $a \ge 0$, $b \ge 0$. To prove (5), we note that if
$a \le b$, say, then

$$\max\{a, b\} = b = \sqrt{b^2} \le \sqrt{a^2 + b^2} \le \sqrt{a^2 + 2ab + b^2}
= a + b \le 2b = 2 \max\{a, b\}.$$


3.35. a. We begin by studying a certain special function

$$f(x) = \frac{x}{1 + |x|} \qquad (-\infty < x < \infty), \tag{6}$$

which plays an important role in the main result of this section (Theorem 3.35d).
To complete the definition of f(x), we set $f(-\infty) = -1$, $f(+\infty) = +1$,
thereby defining f(x) on the extended real line $\overline{R}$. Obviously,
$f(x) > 0$ if $x > 0$ and $f(x) < 0$ if $x < 0$, while f(0) = 0. Moreover,
$f(-x) = -f(x)$ for every finite x, and

$$|f(x)| = \frac{|x|}{1 + |x|} = 1 - \frac{1}{1 + |x|}.$$

The graph of f(x) is shown in Figure 1, from which it is apparent that f(x) is a
one-to-one function.


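b. LEMMA. The inequality

$$|f(x) - f(y)| \le |x - y| \tag{7}$$

holds for arbitrary real numbers x and y.

Proof. For $x \ge 0$, $y \ge 0$ we have

$$|f(x) - f(y)| = \left|\frac{x}{1 + x} - \frac{y}{1 + y}\right|
= \frac{|x - y|}{(1 + x)(1 + y)} \le |x - y|.$$

Similarly,

$$|f(x) - f(y)| = \left|\frac{x}{1 - x} - \frac{y}{1 - y}\right|
= \frac{|x - y|}{(1 - x)(1 - y)} \le |x - y|$$

if $x \le 0$, $y \le 0$, while

$$|f(x) - f(y)| = |f(x)| + |f(y)| = \frac{|x|}{1 + |x|} + \frac{|y|}{1 + |y|}
\le |x| + |y| = |x - y|$$

if $x \le 0$, $y \ge 0$, which in turn implies

$$|f(x) - f(y)| = |f(y) - f(x)| \le |y - x| = |x - y|$$

if $x \ge 0$, $y \le 0$. ∎

Figure 1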
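c. LEMMA. If x and y are such that

$$|f(x)| \le 1 - \delta, \qquad |f(y)| \le 1 - \delta \qquad (0 < \delta < 1), \tag{8}$$

then

$$|x - y| \le \frac{1}{\delta^2}\,|f(x) - f(y)|. \tag{9}$$

Proof. We begin by solving (6) for x. If $x \ge 0$, then |x| = x and

$$x = \frac{f(x)}{1 - f(x)},$$

while if $x < 0$, then |x| = -x and

$$x = \frac{f(x)}{1 + f(x)}.$$

Hence

$$|x - y| = \left|\frac{f(x)}{1 - f(x)} - \frac{f(y)}{1 - f(y)}\right|
= \frac{|f(x) - f(y)|}{[1 - f(x)][1 - f(y)]} \le \frac{1}{\delta^2}\,|f(x) - f(y)|$$

if $x \ge 0$, $y \ge 0$, because of (8). This in turn implies

$$|x - y| = |(-x) - (-y)| \le \frac{1}{\delta^2}\,|f(-x) - f(-y)| = \frac{1}{\delta^2}\,|f(x) - f(y)|$$

if $x \le 0$, $y \le 0$, since then $-x \ge 0$, $-y \ge 0$. Moreover

$$|x - y| = |x| + |y| = -\frac{f(x)}{1 + f(x)} + \frac{f(y)}{1 - f(y)}
\le \frac{1}{\delta}\,[|f(x)| + |f(y)|] = \frac{1}{\delta}\,|f(x) - f(y)|
\le \frac{1}{\delta^2}\,|f(x) - f(y)|$$

if $x \le 0$, $y \ge 0$, and hence

$$|x - y| = |y - x| \le \frac{1}{\delta^2}\,|f(y) - f(x)| = \frac{1}{\delta^2}\,|f(x) - f(y)|$$

if $x \ge 0$, $y \le 0$. ∎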


d. We now use the function f(x) to define a new metric on the extended real line
$\overline{R}$ homeomorphic to the usual metric on the set R (the ordinary real
line):

THEOREM. Given any two points x and y of the extended real line $\overline{R}$, let

$$r(x, y) = |f(x) - f(y)|, \tag{10}$$

where f(x) is the function defined in Sec. 3.35a. Then r(x, y) is a metric on
$\overline{R}$ which is homeomorphic to the usual metric

$$\rho(x, y) = |x - y| \tag{11}$$

on the set R of all finite real numbers ($-\infty < x < \infty$).


Proof. The geometric meaning of the metric (10) is clear from Figure 1: The
distance $r(x,y)$ between two points $x$ and $y$ on the horizontal axis is defined as
the length of the corresponding segment $[f(x), f(y)]$ of the vertical axis. We begin
by verifying that (10) satisfies Axioms a-c of Sec. 3.11. The fact that
$r(x,y) = r(y,x)$ is immediately apparent from (10), so that Axiom b holds. Moreover,
it is also clear from (10), together with the inequality (7) (and the fact that
$f(x)$ is one-to-one), that $r(x,y)$ is positive if $x \ne y$ and zero if $x = y$, so that
Axiom a holds. To prove Axiom c, the triangle inequality, we merely note that
$$r(x,z) = |f(x)-f(z)| \le |f(x)-f(y)| + |f(y)-f(z)| = r(x,y) + r(y,z).$$


Next we verify that the metrics $r(x,y)$ and $\rho(x,y)$ are homeomorphic on the set
$R$. Suppose $x_n \to x$ in the metric (11). Then, given any $\varepsilon > 0$, there is an $N$
such that $|x - x_n| < \varepsilon$ for all $n > N$. But then, by (7),
$$r(x,x_n) = |f(x)-f(x_n)| \le |x-x_n| < \varepsilon$$
for all $n > N$, so that $x_n \to x$ in the metric (10). On the other hand, suppose
$x_n \to x$ in the metric (10). Noting that $|f(x)| < 1$, we set $|f(x)| = 1 - 2\delta$
$(0 < \delta \le \tfrac12)$. Next, given any $\varepsilon > 0$, we find an $N$ such that
$$r(x,x_n) = |f(x)-f(x_n)| < \delta^2 \min\{\delta, \varepsilon\}$$
for all $n > N$. Then
$$|f(x_n)| \le |f(x)| + \delta^2 \min\{\delta, \varepsilon\} \le |f(x)| + \delta = 1-\delta$$
for all $n > N$, and since $|f(x)| = 1 - 2\delta < 1 - \delta$, we can apply the inequality (9),
obtaining
$$\rho(x,x_n) = |x-x_n| \le \frac{1}{\delta^2}\,|f(x)-f(x_n)| < \min\{\delta, \varepsilon\} \le \varepsilon$$
for all $n > N$, so that $x_n \to x$ in the metric (11). ∎
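The two inequalities used in this proof can be spot-checked numerically. The following is only a sketch, assuming $f(x) = x/(1+|x|)$ as in (6) with $f(\pm\infty) = \pm 1$; the function names, the sampling range, and the tolerance are illustrative choices, not part of the text. It tests $|f(x)-f(y)| \le |x-y|$ and, with $\delta$ taken as large as (8) allows, $|x-y| \le \delta^{-2}|f(x)-f(y)|$ on random pairs.

```python
import math
import random


def f(x: float) -> float:
    """f(x) = x / (1 + |x|), extended by f(-inf) = -1 and f(+inf) = +1."""
    if math.isinf(x):
        return math.copysign(1.0, x)
    return x / (1.0 + abs(x))


def r(x: float, y: float) -> float:
    """The metric r(x, y) = |f(x) - f(y)| on the extended real line."""
    return abs(f(x) - f(y))


def check_lemmas(samples: int = 10_000, seed: int = 0, tol: float = 1e-9) -> bool:
    """Spot-check both lemma inequalities on random finite pairs (x, y)."""
    rng = random.Random(seed)
    for _ in range(samples):
        x, y = rng.uniform(-50.0, 50.0), rng.uniform(-50.0, 50.0)
        # First lemma: f does not increase distances.
        if r(x, y) > abs(x - y) + tol:
            return False
        # Second lemma, with the largest delta satisfying |f| <= 1 - delta.
        delta = 1.0 - max(abs(f(x)), abs(f(y)))
        if abs(x - y) > r(x, y) / delta**2 + tol:
            return False
    return True
```

A random check of this kind proves nothing, of course; it only guards against sign or exponent slips when the formulas are transcribed.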


e. Consider the set $\{x \in \overline{R}: r(x,\infty) \le c\}$, i.e., the closed ball of radius $c$
centered at the point $\infty$ in the metric space $\overline{R}$ equipped with the metric (10).
By definition, this ball is the set of all $x \in \overline{R}$ satisfying the inequality
$$|f(\infty) - f(x)| = |1 - f(x)| \le c. \tag{12}$$
Confining ourselves to the most important case, where $c$ is small and $c \le 1$, we
find that only nonnegative values of $x$ can satisfy (12), since $f(x) < 0$ if $x < 0$,
which implies $f(\infty) - f(x) > 1$. The inequality (12) thus becomes
$$1 - \frac{x}{1+x} = \frac{1}{1+x} \le c,$$
or equivalently,
$$x \ge \frac{1}{c} - 1.$$
Thus the ball $\{x \in \overline{R}: r(x,\infty) \le c\}$ is the closed interval
$$\frac{1}{c} - 1 \le x \le \infty.$$
Similarly, the closed ball of radius $c \le 1$ ($c$ small) centered at the point $-\infty$ is
just the closed interval
$$-\infty \le x \le 1 - \frac{1}{c}.$$
In particular, the points of the sequence
$$1, 2, \dots, n, \dots \tag{13}$$
in the space $\overline{R}$ all lie, starting from some value of $n$, in an arbitrary ball
centered at the point $\infty$, and hence
$$\lim_{n \to \infty} n = \infty$$
in the space $\overline{R}$. At the same time, the sequence (13) obviously has no limit in the
space $R$.
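As a small illustration of the computation above, the sketch below uses exact rational arithmetic to find the first positive integer lying in the closed ball of radius $c$ about $\infty$. It assumes $f(x) = x/(1+|x|)$ with $f(\infty) = 1$; the helper names are hypothetical.

```python
import math
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    """f(x) = x / (1 + |x|) for finite x."""
    return x / (1 + abs(x))


def in_ball_at_infinity(x: Fraction, c: Fraction) -> bool:
    """True if r(x, oo) = |f(oo) - f(x)| = |1 - f(x)| <= c."""
    return abs(1 - f(x)) <= c


def first_index_in_ball(c: Fraction) -> int:
    """Smallest positive integer n with n >= 1/c - 1 (assumes 0 < c <= 1)."""
    return max(1, math.ceil(1 / c - 1))
```

For example, with $c = 1/100$ the ball is the interval $99 \le x \le \infty$, so every term of the sequence $1, 2, \dots$ from the 99th onward lies in it.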


f. It should be noted that the space $\overline{R}$ with the metric (10) is isometric to the
closed interval $[-1, 1]$ with the usual metric. In fact, the isometry between
$\overline{R}$ and $[-1, 1]$ is established by the one-to-one correspondence
$x \leftrightarrow f(x)$, since the distance between the points $x, y \in \overline{R}$ with the metric (10)
equals the ordinary distance between the corresponding points $f(x)$ and $f(y)$.
Dropping the end points ($-\infty$, $\infty$ for the space $\overline{R}$ and $-1$, $1$ for the interval
$[-1, 1]$), we get an isometry between the real line $R$ with the metric (10) and the
open interval $(-1, 1)$ with the usual metric. Using Theorem 3.35d, we find that the
metric space $R$ with the usual metric is homeomorphic to the interval $(-1, 1)$.


3.4. Limit Points 


3.41. Let $M$ be a metric space, equipped with a metric $\rho$, and let $x_n$ be a sequence
of points in $M$. Then a point $x \in M$ is said to be a limit point of the sequence $x_n$
if, given any $\varepsilon > 0$ and any integer $N > 0$, there is an integer $n > N$ such that
$\rho(x, x_n) < \varepsilon$.


In other words, we say that a point $x$ is a limit point of the sequence $x_n$ if every
ball centered at $x$ contains points of the sequence with arbitrarily large values of
$n$, but not necessarily all values of $n$ (as in the case of a convergent sequence). A
convergent sequence $x_n$ with limit $x$ obviously has $x$ as a limit point, and
moreover has no other limit points (the proof is similar to that of Theorem
3.33a). On the other hand, a divergent sequence may have any number of limit
points or no limit points at all. For example, the sequence
$$x_n = (-1)^n\left(1 + \frac{1}{n}\right) \qquad (n = 1, 2, \dots)$$
of points on the real line (with the usual metric) has two limit points $-1$ and $1$
(and approaches neither), while the sequence $x_n = n^{(-1)^n}$ $(n = 1, 2, \dots)$ has the
single limit point $0$ (which it also does not approach). As another example,
suppose all the rational numbers are written in a single sequence (cf. Theorem
2.34). Then every point of the real line is a limit point of this sequence!


3.42. Given a sequence
$$x_1, x_2, \dots, x_n, \dots, \tag{1}$$
let
$$x_{n_1}, x_{n_2}, \dots, x_{n_k}, \dots \tag{2}$$
be any sequence of points belonging to (1), where the (distinct positive) integers
$n_1, n_2, \dots, n_k, \dots$ are arranged in increasing order. Then (2) is called a
subsequence of the original sequence (1).


THEOREM. A point $x$ is a limit point of a sequence $x_n$ if and only if $x_n$ has a
subsequence converging to $x$.

Proof. Suppose $x$ is a limit point of $x_n$. Then, given any $m = 1, 2, \dots$, we can find
a point $x_{n_m}$ $(n_1 < n_2 < \cdots)$ of the sequence $x_n$ such that
$$\rho(x, x_{n_m}) < \frac{1}{m}.$$
But then the subsequence $x_{n_m}$ clearly converges to $x$. Conversely, if $x_n$ has a
subsequence converging to $x$, then $x$ is obviously a limit point of $x_n$, by the very
definition of a limit point. ∎


This leads at once to the following alternative definition of a limit point: A
point $x$ is said to be a limit point of the sequence $x_n$ if $x_n$ has a subsequence
converging to $x$.
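The construction in the proof of Theorem 3.42 is effectively an algorithm: for $m = 1, 2, \dots$ pick the next index whose term lies within $1/m$ of the limit point. The sketch below applies it to the sequence $x_n = (-1)^n(1 + 1/n)$ of Sec. 3.41; the function names and the search cap `max_n` are assumptions made for the illustration.

```python
from fractions import Fraction


def x(n: int) -> Fraction:
    """The example sequence x_n = (-1)^n (1 + 1/n)."""
    return (-1) ** n * (1 + Fraction(1, n))


def subsequence_indices(seq, target, count, max_n=10**6):
    """Indices n_1 < n_2 < ... with |seq(n_m) - target| < 1/m for m = 1..count."""
    indices, n = [], 0
    for m in range(1, count + 1):
        n += 1  # the indices must increase strictly
        while abs(seq(n) - target) >= Fraction(1, m):
            n += 1
            if n > max_n:
                raise ValueError("no suitable term found; target may not be a limit point")
        indices.append(n)
    return indices
```

For the limit point $1$ the search picks even indices, and for $-1$ odd ones, which matches the two subsequences one would write down by hand.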


3.43. The second definition of a limit point rests entirely on the concept of a
convergent sequence. Since a homeomorphism between two metric spaces $M$
and $M'$ preserves convergent sequences, we conclude that if $x \in M$ is a limit point
of a sequence $x_n \in M$ and if $M'$ is homeomorphic to $M$, then the point $x' \in M'$
corresponding to the point $x \in M$ (under the homeomorphism) is a limit point of
the sequence $x'_n \in M'$ corresponding to the sequence $x_n$. In particular, if two
distinct but homeomorphic metrics $\rho$ and $\rho'$ are defined on the same set $M$ and if
$x$ is a limit point of a sequence $x_n$ with respect to the metric $\rho$, then $x$ is also a
limit point of $x_n$ with respect to the metric $\rho'$.


3.44. a. A point $x$ in a metric space $M$ (equipped with a metric $\rho$) is said to be a
limit point of a given subset $A \subset M$ if every neighborhood
$U_\varepsilon(x) = \{y \in M: \rho(x, y) < \varepsilon\}$ of $x$ contains a point $y \in A$ distinct from $x$ itself.
The definition of the limit point of a subset differs somewhat from the definition
of a limit point of a sequence. This is explained by the fact that the concept of a
sequence of points in $M$ does not reduce to that of a subset of $M$, since points can
“repeat themselves” in a sequence but not in a subset (see Sec. 2.81). For example,
the points 0 and 1 are limit points of the sequence 0, 1, 0, 1, ... but not of the set
consisting of the two points 0 and 1.

Nevertheless, the results of Secs. 3.42 and 3.43 pertaining to limit points of a
sequence carry over to the case of limit points of a subset. Thus if $x$ is a limit
point of a set $A \subset M$, we can select a sequence of distinct points of $A$ converging
to $x$ (the method of proof is the same as that of Theorem 3.42), while conversely,
if we can select a sequence of distinct points of $A$ converging to a point $x \in M$,
then $x$ is a limit point of $A$. It follows, just as in Sec. 3.43, that limit points of a
set $A$ are preserved under homeomorphisms of $M$.


b. THEOREM. Let $A$ be a set on the real line which is bounded from above, and
suppose $\xi = \sup A$ does not belong to $A$. Then $\xi$ is a limit point of $A$.

Proof. By the very definition of the least upper bound, there is a point $x_n \in A$
such that
$$\xi - \frac{1}{n} < x_n \le \xi$$
for every $n = 1, 2, \dots$, where $x_n \ne \xi$ since $\xi \notin A$. But then every neighborhood of
$\xi$ contains a point of $A$ distinct from $\xi$ itself. ∎

The theorem remains true if $A$ is bounded from below and $\xi = \inf A$ instead.


3.45. It is desirable to have general principles allowing us to determine the
existence of limit points of a wide class of sets or sequences. A useful result of
this type is the following

THEOREM (Bolzano-Weierstrass principle). Every infinite set of points in a closed
interval $[a, b] \subset R$ has at least one limit point.

Proof. Suppose an infinite set of points $E$ lies in some interval $[a, b]$. Then at
least one of the two halves
$$\left[a, \frac{a+b}{2}\right], \qquad \left[\frac{a+b}{2}, b\right]$$
of $[a, b]$ contains an infinite subset of $E$. Starting from the given interval
$\Delta_1 = [a, b]$ and using this obvious fact, we construct a sequence of nested closed
intervals $\Delta_1 \supset \Delta_2 \supset \cdots$, where each interval is half of the preceding interval and
contains an infinite subset of the given set $E$. It follows from Theorem 1.82 that
the intersection of all the intervals $\Delta_1, \Delta_2, \dots$ consists of a single point $x_0$. To see
that $x_0$ is a limit point of $E$, let $U = U_\delta(x_0) = \{x \in R: |x - x_0| < \delta\}$ be any
neighborhood of $x_0$ and let $n$ be such that the length of the interval $\Delta_n$ is less than
$\delta$. Then $\Delta_n$ contains $x_0$ and is entirely contained in $U$, so that $U$, like $\Delta_n$,
contains infinitely many points of $E$, i.e., $x_0$ is a limit point of $E$. ∎
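The bisection argument can be imitated on a computer, with one important caveat: a program only sees finitely many points, so "contains infinitely many points" has to be replaced by a heuristic. The sketch below keeps, at each step, the half containing more of the sampled points. This is an assumption of the illustration, not the proof itself, and the function name is hypothetical.

```python
def bisect_cluster(points, a, b, steps=30):
    """Halve [a, b] repeatedly, keeping the half that holds more sample points.

    A finite stand-in for the Bolzano-Weierstrass bisection: "more points"
    plays the role of "infinitely many points".
    """
    pts = list(points)
    for _ in range(steps):
        mid = (a + b) / 2
        left = [p for p in pts if a <= p <= mid]
        right = [p for p in pts if mid <= p <= b]
        if len(left) >= len(right):
            pts, b = left, mid
        else:
            pts, a = right, mid
    return a, b
```

Applied to the sample $1, \tfrac12, \tfrac13, \dots$ in $[0, 1]$, the intervals shrink toward $0$, the only limit point of that set. With too many steps relative to the sample size the heuristic eventually fails, which is exactly where the finite model departs from the theorem.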


3.46. There is an analogous result, valid for sequences:

THEOREM (Bolzano-Weierstrass principle for sequences). Every sequence of points in
a closed interval $[a, b] \subset R$ has at least one limit point.

Proof. Repeat the proof of Theorem 3.45, taking account of the fact that points
of a sequence can repeat themselves. ∎


3.47. There are sets on the real line $R$ without limit points, e.g., the set
$\{1, 2, \dots, n, \dots\}$. However, every infinite set on the extended real line
$\overline{R}$, metrized in accordance with Theorem 3.35d, has a limit point. This follows
from the fact that $\overline{R}$ is isometric to the interval $[-1, 1]$ with the ordinary metric
(see Sec. 3.35f), and the property of being a limit point is preserved under an
isometry (or, for that matter, under a homeomorphism, as shown in Sec. 3.43).


3.48. Next we prove a sufficient condition for the absence of limit points:

THEOREM. Suppose the distance between any two points $x_m$ and $x_n$ $(m \ne n)$ of a
sequence $x_1, x_2, \dots$ is bounded from below by a positive constant, so that
$$\rho(x_m, x_n) \ge C \qquad (m \ne n;\ m, n = 1, 2, \dots). \tag{3}$$
Then the sequence $x_1, x_2, \dots$ has no limit points.

Proof. Suppose, to the contrary, that $x$ is a limit point of the sequence. Then the
sequence certainly contains points $x_m$ and $x_n$ $(m \ne n)$ such that
$$\rho(x, x_m) < \frac{C}{2}, \qquad \rho(x, x_n) < \frac{C}{2}.$$
But this is impossible, since then
$$\rho(x_m, x_n) \le \rho(x_m, x) + \rho(x, x_n) < C,$$
contrary to (3). ∎


3.5. Closed Sets 


3.51. A set $F$ in a metric space $M$ is said to be closed (in $M$) if it contains all its
limit points. Thus the interval $a \le x \le b$ on the real line is closed, but not the
interval $a \le x < b$, which does not contain its limit point $b$.

THEOREM. The closed ball $V = \{x \in M: \rho(x, x_1) \le r\}$ is a closed set.

Proof. Let $x_0$ be any point not in $V$, so that
$$\rho(x_0, x_1) = r_1 > r. \tag{1}$$
Then there are no points of $V$ in the ball of radius $\tfrac12(r_1 - r)$ centered at $x_0$.
In fact, if there were such a point, say $z$, then
$$\rho(x_0, x_1) \le \rho(x_0, z) + \rho(z, x_1) \le \tfrac12(r_1 - r) + r = \tfrac12(r_1 + r) < r_1,$$
contrary to (1). Hence $x_0$ cannot be a limit point of the set $V$, i.e., $V$ contains
all its limit points. ∎


3.52. There is an intimate connection between closed sets and open sets in a
metric space, as shown by the following

THEOREM. Given a metric space $M$, the complement $G$ of a closed set $F \subset M$ is open,
while the complement $F$ of an open set $G \subset M$ is closed.

Proof. Let $G$ be the complement of a closed set $F$, and let $x_0$ be any point in $G$.
To prove that $G$ is open, we must show that $G$ contains some open ball centered
at $x_0$. If there were no such ball, then every ball centered at $x_0$ would contain
points of $F$. But then $x_0$ would be a limit point of $F$, and hence $x_0$ would have to
belong to $F$, since $F$ is closed. This contradicts $x_0 \in G$, thereby proving that $G$ is
open.

Next let $F$ be the complement of an open set $G$, and let $x_0$ be any point of $G$.
Then $G$ contains $x_0$ together with some ball centered at $x_0$, so that $x_0$ cannot be a
limit point of $F$. Hence any limit point of $F$ can only be a point of $F$ itself, i.e., $F$
is closed. ∎


3.53. a. THEOREM. Every closed set $F$ on the real line is obtained by deleting a
finite or countable collection of nonintersecting intervals from the line.

Proof. An immediate consequence of Theorems 3.23 and 3.52. ∎

The deleted intervals, namely the component intervals of the open set $G$
complementary to $F$, are said to be adjacent to $F$.

b. THEOREM. Let $F$ be a closed set on the real line which is bounded from above,
and let $\xi = \sup F$. Then $\xi$ belongs to $F$.

Proof. If $\xi \notin F$, then, by Theorem 3.44b, $\xi$ would be a limit point of $F$ not
belonging to $F$, which is impossible, since $F$ is closed. ∎

The theorem remains true if $F$ is bounded from below and $\xi = \inf F$ instead.


3.54. THEOREM. The union of any finite collection of closed sets and the
intersection of any collection of closed sets are themselves closed sets.

Proof. Given closed sets $F_\nu$ (where the parameter $\nu$ ranges over a finite “index
set”), let $G_\nu$ be the complementary open sets, as in Theorem 3.52. Then
$$C\bigcup_\nu F_\nu = \bigcap_\nu CF_\nu = \bigcap_\nu G_\nu,$$
by formula (2), p. 28. But $\bigcap_\nu G_\nu$ is open, by Theorem 3.22, and hence
$\bigcup_\nu F_\nu$ is closed, by Theorem 3.52.

Next let $\nu$ range over an arbitrary (not necessarily finite) index set. Then
$$C\bigcap_\nu F_\nu = \bigcup_\nu CF_\nu = \bigcup_\nu G_\nu,$$
by formula (2′), p. 29. But $\bigcup_\nu G_\nu$ is open, again by Theorem 3.22, and hence
$\bigcap_\nu F_\nu$ is closed, again by Theorem 3.52. ∎
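A standard example, not taken from the text, shows why the union in Theorem 3.54 is restricted to finite collections. On the real line with the usual metric, consider the closed intervals $F_n$ below (the notation $F_n$ is chosen here only for the illustration):

```latex
% Each F_n is closed, but their countable union is not.
\[
  F_n = \left[\tfrac{1}{n},\, 1\right] \quad (n = 1, 2, \dots), \qquad
  \bigcup_{n=1}^{\infty} F_n = (0, 1].
\]
% The point 0 is a limit point of (0, 1], since 1/n -> 0 as n -> infinity,
% yet 0 does not belong to (0, 1]. Hence the union is not closed.
```

In terms of the proof, the complements of the $F_n$ are open, but their intersection over all $n$ is the set of points outside $(0, 1]$, which is not open because no interval about $0$ avoids $(0, 1]$.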


3.6. Dense Sets and Closures 


3.61. Definition. A set $A$ in a metric space $M$ is said to be (everywhere) dense
relative to another set $B \subset M$ if every point of $B$ is either a point of $A$ or a limit
point of $A$. In other words, we say that $A$ is dense relative to $B$ if every ball
centered at a point of $B$ contains a point of $A$. If $A$ is dense relative to $B$ and if, in
addition, $A$ is a subset of $B$, we say that $A$ is dense in $B$. For example, the set of
all rational points is dense in the real line $-\infty < x < \infty$, while the set of all points
$(r_1, \dots, r_n)$ with rational coordinates is dense in $n$-dimensional Euclidean space.


3.62. The property of being dense is “transitive” in the following sense:

THEOREM. If a set $A$ is dense relative to a set $B$, and if $B$ is in turn dense relative
to a set $C$, then $A$ is dense relative to $C$.

Proof. Given any $\varepsilon > 0$ and any $z \in C$, we can find first a point $y \in B$ such that
$\rho(y, z) < \varepsilon/2$ and then a point $x \in A$ such that $\rho(x, y) < \varepsilon/2$. Hence, given any
$\varepsilon > 0$ and any $z \in C$, we can find a point $x \in A$ such that
$\rho(x, z) \le \rho(x, y) + \rho(y, z) < \varepsilon$. But then $A$ is dense relative to $C$. ∎


3.63. Definition. Given a set $A$ in a metric space $M$, by the closure of $A$, denoted
by $\overline{A}$, we mean the set consisting of all points of $A$ together with all limit points
of $A$. Obviously $\overline{A} \supset A$ in general, where $\overline{A} = A$ if and only if every limit point of
$A$ is a point of $A$, i.e., if and only if $A$ is closed. If $\overline{A} \subset A$, then, since $\overline{A} \supset A$
always holds, we have $\overline{A} = A$, so that $A$ is closed.


3.64. THEOREM. The closure $\overline{A}$ of any set $A$ is a closed set.

Proof. First we note that every set is dense in its own closure, by the very
definition of closure. Thus $A$ is dense in $\overline{A}$, and $\overline{A}$ is dense in $\overline{\overline{A}}$. It follows
from Theorem 3.62 that $A$ is dense in $\overline{\overline{A}}$, i.e., every point of the set
$\overline{\overline{A}}$ is either a point of $A$ or a limit point of $A$. But then
$\overline{\overline{A}} \subset \overline{A}$, so that $\overline{A}$ is closed. ∎


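3.65. THEOREM. The closure $\overline{A}$ of a bounded set $A$ is itself a bounded set, and
moreover $\operatorname{diam} \overline{A} = \operatorname{diam} A$.

Proof. Since $A \subset \overline{A}$, we have
$$\operatorname{diam} A \le \operatorname{diam} \overline{A} \tag{1}$$
(recall Sec. 3.12a). Given any two points $\bar{x}, \bar{y} \in \overline{A}$ and any $\varepsilon > 0$, choose two
points $x, y \in A$ such that
$$\rho(x, \bar{x}) < \frac{\varepsilon}{2}, \qquad \rho(y, \bar{y}) < \frac{\varepsilon}{2}.$$
Then
$$\rho(\bar{x}, \bar{y}) \le \rho(\bar{x}, x) + \rho(x, y) + \rho(y, \bar{y}) < \frac{\varepsilon}{2} + \operatorname{diam} A + \frac{\varepsilon}{2},$$
which implies
$$\operatorname{diam} \overline{A} = \sup_{\bar{x}, \bar{y} \in \overline{A}} \rho(\bar{x}, \bar{y}) \le \varepsilon + \operatorname{diam} A,$$
and hence
$$\operatorname{diam} \overline{A} \le \operatorname{diam} A, \tag{2}$$
since $\varepsilon > 0$ is arbitrary (in particular, $\overline{A}$ is bounded). Comparing (1) and (2), we
find that $\operatorname{diam} \overline{A} = \operatorname{diam} A$. ∎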


3.66. THEOREM. Given a closed set $F$ and an open set $G \supset F$, there exists another
open set $H \supset F$ such that $F \subset H \subset \overline{H} \subset G$.

Proof. Let $d(x)$ be the distance from the point $x \in G$ to the complement of $G$, i.e.,
the quantity
$$d(x) = \inf_{y \in M - G} \rho(x, y),$$
where $M$ is the underlying metric space. Then $d(x)$ is positive for every $x \in G$,
since $G$ contains an entire neighborhood of $x$. With each $x \in F$ we now associate
the open ball of radius $\tfrac12 d(x)$ centered at $x$. Let $H$ be the union of all these balls.
Then $H$ is open, by Theorem 3.22, and obviously $H \supset F$. To prove that $\overline{H} \subset G$,
suppose the contrary holds. Then there exists a point $y \in \overline{H} \cap (M - G)$. Thus we
can find a sequence $z_n \in H$ such that $z_n \to y$ and a sequence $x_n \in F$ such that
$$\rho(x_n, z_n) < \tfrac12 d(x_n).$$
It follows that
$$d(x_n) \le \rho(x_n, y) \le \rho(x_n, z_n) + \rho(z_n, y) \le \tfrac12 d(x_n) + \rho(z_n, y),$$
and hence
$$d(x_n) \le 2\rho(z_n, y), \qquad \rho(x_n, y) \le 2\rho(z_n, y),$$
so that
$$y = \lim_{n \to \infty} x_n \in F \subset G.$$
This contradicts the assumption $y \in M - G$, thereby proving that $\overline{H} \subset G$. ∎


3.67. In particular, if $F$ is bounded (Sec. 3.12a), we can estimate the function
$d(x)$ for $x \in F$ as
$$d(x) = \inf_{y \in M - G} \rho(x, y) \le \inf_{y \in M - G} [\rho(x, x_0) + \rho(x_0, y)] \le \operatorname{diam} F + d(x_0), \tag{3}$$
where $x_0$ is a fixed point in $F$. Hence the set $H$ constructed in Theorem 3.66 is
also bounded, since the definition of $H$ together with (3) implies
$$\operatorname{diam} H \le \operatorname{diam} F + 2 \sup_{x \in F} d(x) \le 3 \operatorname{diam} F + 2d(x_0).$$


3.7. Complete Metric Spaces 


3.71. a. Definition. A sequence of points $x_1, x_2, \dots$ in a metric space $M$ is called a
fundamental sequence (or a Cauchy sequence) if, given any $\varepsilon > 0$, there exists an
integer $N > 0$ such that $\rho(x_m, x_n) < \varepsilon$ for all $m, n > N$.

b. THEOREM. Every convergent sequence is fundamental.

Proof. Let $x_n$ be a convergent sequence, with limit $x$. Then, given any $\varepsilon > 0$,
there is an $N$ such that
$$\rho(x, x_n) < \frac{\varepsilon}{2}$$
for all $n > N$. But then
$$\rho(x_m, x_n) \le \rho(x_m, x) + \rho(x, x_n) < \varepsilon$$
for all $m, n > N$, i.e., $x_n$ is a fundamental sequence. ∎

c. THEOREM. Every fundamental sequence is bounded.

Proof. Given a fundamental sequence $x_n$, choose $N$ such that $\rho(x_m, x_n) < 1$ for all
$m, n > N$. Then
$$\rho(x_m, x_{N+1}) \le \begin{cases} \max\{\rho(x_1, x_{N+1}), \dots, \rho(x_N, x_{N+1})\} & \text{if } m \le N, \\ 1 & \text{if } m > N, \end{cases}$$
so that the distance from every point $x_1, x_2, \dots$ to the point $x_{N+1}$ is bounded by a
fixed constant. ∎

d. Definition. A metric space $M$ is said to be complete if every fundamental
sequence in $M$ converges to an element of $M$. Otherwise, $M$ is said to be
incomplete.


3.72. a. THEOREM. The real line $R$, equipped with the usual metric $\rho(x, y) = |x - y|$,
is a complete metric space.

Proof. Let $x_n$ be a fundamental sequence in $R$. Then $x_n$ is bounded, by Theorem
3.71c. Let
$$a_m = \inf_{n \ge m} x_n, \qquad b_m = \sup_{n \ge m} x_n \qquad (m = 1, 2, \dots).$$
Then, obviously, $a_m \le a_{m+1}$, $b_m \ge b_{m+1}$, so that $[a_m, b_m] \supset [a_{m+1}, b_{m+1}]$.
Hence, by Theorem 1.81, there is a point $p \in R$ contained in all the intervals
$[a_m, b_m]$ $(m = 1, 2, \dots)$. We now show that
$$p = \lim_{n \to \infty} x_n, \tag{1}$$
which incidentally proves that the intersection of all the $[a_m, b_m]$ consists of the
single point $p$ (cf. Theorem 3.33a). Given any $\varepsilon > 0$, choose $N$ such that
$|x_m - x_n| \le \varepsilon$ for all $m, n > N$. Then hold $m = N + 1$ fixed, and let
$n = N + 1, N + 2, \dots$ Since all the $x_n$ with $n \ge N + 1$ are no further than $\varepsilon$ from
$x_{N+1}$, the same is true of the numbers $a_{N+1}$ and $b_{N+1}$. Moreover, $p$ belongs to the
interval $[a_{N+1}, b_{N+1}]$, and hence
$$|p - x_n| \le b_{N+1} - a_{N+1} = (b_{N+1} - x_{N+1}) + (x_{N+1} - a_{N+1}) \le 2\varepsilon$$
for all $n > N$. Since $\varepsilon > 0$ is arbitrary, this implies (1). ∎


b. THEOREM. A numerical sequence $x_n$ is convergent if and only if it is
fundamental, i.e., if and only if the following condition, called the Cauchy
convergence criterion, is satisfied: Given any $\varepsilon > 0$, there exists an integer
$N > 0$ such that $|x_m - x_n| < \varepsilon$ for all $m, n > N$.

Proof. An immediate consequence of Theorems 3.71b and 3.72a. ∎


c. The interval $(0, 1)$ is a metric space when equipped with the usual metric of
the real line. The sequence
$$\frac{1}{2}, \frac{1}{3}, \dots, \frac{1}{n}, \dots$$
is fundamental in this space, but has no limit in the space. Therefore $(0, 1)$ is not
a complete space. In particular, this shows that the property of being a complete
space is critically dependent on the choice of a metric, and may not be preserved
on going over to a homeomorphic metric. In fact, the real line equipped with the
usual metric
$$\rho(x, y) = |x - y|$$
is complete, as shown in Theorem 3.72a, while the real line $R$ equipped with the
homeomorphic metric $r(x, y) = |f(x) - f(y)|$, as in Theorem 3.35d, is isometric to
the interval $(-1, 1)$ equipped with the usual metric (see Sec. 3.35f), and hence
fails to be complete, as just shown.
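The dependence of completeness on the metric can be seen numerically. The sketch below, which assumes $f(x) = x/(1+|x|)$ as in Sec. 3.35a and uses hypothetical helper names, compares the spread of a block of terms of the sequence $x_n = n$ under the metric $r$ and under the usual metric.

```python
def f(x: float) -> float:
    """f(x) = x / (1 + |x|) for finite x."""
    return x / (1 + abs(x))


def r(x: float, y: float) -> float:
    """The metric r(x, y) = |f(x) - f(y)| homeomorphic to the usual one on R."""
    return abs(f(x) - f(y))


def usual(x: float, y: float) -> float:
    """The usual metric |x - y|."""
    return abs(x - y)


def tail_diameter(metric, start: int, stop: int) -> float:
    """Largest distance between terms x_m = m, x_n = n with start <= m, n < stop."""
    return max(metric(m, n) for m in range(start, stop) for n in range(start, stop))
```

Under $r$ the terms with indices from 1000 to 1199 lie within $10^{-3}$ of one another, as the Cauchy condition requires, while under the usual metric the same block has diameter 199. The sequence $n$ is thus fundamental for $r$ but has no limit in $R$, in line with the incompleteness just noted.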


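d. THEOREM. The $m$-dimensional real space $R_m$ of all points $x = (\xi_1, \dots, \xi_m)$,
$y = (\eta_1, \dots, \eta_m)$, ..., equipped with the usual metric
$$\rho(x, y) = \sqrt{\sum_{k=1}^{m} (\xi_k - \eta_k)^2},$$
is complete.

Proof. Let
$$x^{(n)} = (\xi_1^{(n)}, \dots, \xi_m^{(n)}) \qquad (n = 1, 2, \dots)$$
be a fundamental sequence of vectors in $R_m$. Given any $\varepsilon > 0$, let $N$ be such that
$$\rho(x^{(p)}, x^{(q)}) = \sqrt{\sum_{k=1}^{m} (\xi_k^{(p)} - \xi_k^{(q)})^2} < \varepsilon$$
for all $p, q > N$. Then, by Theorem 3.14b,
$$|\xi_k^{(p)} - \xi_k^{(q)}| \le \sqrt{\sum_{j=1}^{m} (\xi_j^{(p)} - \xi_j^{(q)})^2} < \varepsilon \qquad (k = 1, \dots, m)$$
for all $p, q > N$. Thus each numerical sequence $\xi_k^{(n)}$ $(k = 1, \dots, m)$ is fundamental
on the real line. It follows from Theorem 3.72b that each of the limits
$$\xi_k = \lim_{n \to \infty} \xi_k^{(n)} \qquad (k = 1, \dots, m)$$
exists.

Now consider the vector
$$x = (\xi_1, \dots, \xi_m) \in R_m. \tag{2}$$
By Theorem 3.14b again, we have
$$\rho(x, x^{(p)}) = \sqrt{\sum_{k=1}^{m} (\xi_k - \xi_k^{(p)})^2} \le \sqrt{m} \max_{1 \le k \le m} |\xi_k - \xi_k^{(p)}|. \tag{3}$$
Given any $\varepsilon > 0$, choose $N$ such that the $m$ inequalities
$$|\xi_k - \xi_k^{(p)}| < \frac{\varepsilon}{\sqrt{m}} \qquad (k = 1, \dots, m)$$
hold simultaneously for all $p > N$. Then (3) implies $\rho(x, x^{(p)}) < \varepsilon$ for all $p > N$.
Hence the sequence of vectors $x^{(p)}$ converges to the vector (2), and the space $R_m$
is complete. ∎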
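Both steps of this proof rest on the same pair of elementary estimates: each coordinate difference is at most the Euclidean distance, and the Euclidean distance is at most $\sqrt{m}$ times the largest coordinate difference. A minimal sketch, with hypothetical function names and a small floating-point tolerance, checks the two estimates for given vectors:

```python
import math


def rho(x, y) -> float:
    """Euclidean distance between two points of R_m given as equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def coordinate_bounds_hold(x, y, tol: float = 1e-12) -> bool:
    """Check max_k |x_k - y_k| <= rho(x, y) <= sqrt(m) * max_k |x_k - y_k|."""
    m = len(x)
    largest_gap = max(abs(a - b) for a, b in zip(x, y))
    distance = rho(x, y)
    return largest_gap <= distance + tol and distance <= math.sqrt(m) * largest_gap + tol
```

For instance, for $x = (0, 0)$ and $y = (3, 4)$ the distance is 5, which lies between the largest coordinate gap 4 and $\sqrt{2} \cdot 4 \approx 5.66$.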


3.73. a. THEOREM. Let $M$ be a complete metric space contained in another metric
space $P$ (with the same metric). Then $M$ is closed in $P$.

Proof. Let $y \in P$ be a limit point of the set $M$, and let $x_n$ be a sequence of points
in $M$ converging to $y$ (see Sec. 3.44a). Since the sequence $x_n$ is fundamental, by
Theorem 3.71b, and since the space $M$ is complete, it follows that $x_n$ converges
to a limit
$$z = \lim_{n \to \infty} x_n$$
in $M$. But then $y = z \in M$, by the uniqueness of the limit (Theorem 3.33a). ∎

b. THEOREM. Let $F$ be a closed subset of a complete metric space $M$. Then $F$ is
complete, regarded as a metric space itself (with the metric “borrowed” from
$M$).

Proof. Every fundamental sequence $y_n \in F$ converges to a limit $y \in M$, since $M$ is
complete. But $y$ belongs to $F$, since $F$ is closed. ∎

In particular, every closed interval $[a, b]$ on the real line is a complete metric
space, being a closed subset (Sec. 3.51) of a complete metric space (Theorem
3.72a).


3.74. For the real line we have the principle of nested intervals (Theorem 1.81),
which asserts that a system of nested closed intervals always has a nonempty
intersection. We now consider various analogues of this principle, valid for any
complete metric space.

a. A set $Q$ of nonempty subsets of a set $M$ with the property that given any two
subsets $A, B \in Q$, either $A \subset B$ or $B \subset A$, is called a system of nested subsets.


LEMMA. Let $Q$ be any system of nested subsets of a complete metric space $M$, and
suppose $Q$ contains subsets of arbitrarily small diameter (Sec. 3.12a). Then
there is a unique point $p \in M$ such that every neighborhood
$$U_\varepsilon(p) = \{x \in M: \rho(x, p) < \varepsilon\} \tag{4}$$
of $p$ contains some set $A \in Q$.

Proof. By hypothesis, given any $m = 1, 2, \dots$, there is a subset $A_m \in Q$ such that
$$\operatorname{diam} A_m < \frac{1}{m}.$$
Let $x_m$ be any point of $A_m$. If $n > m$, then
$$\rho(x_m, x_n) < \frac{1}{m},$$
since either $A_m \subset A_n$ or $A_n \subset A_m$. Thus the sequence $x_m$ is fundamental, with limit
$$p = \lim_{n \to \infty} x_n.$$
The point $p \in M$ satisfies the condition of the lemma. In fact, given any $\varepsilon > 0$, we
need only choose $n$ such that both inequalities
$$\rho(p, x_n) < \frac{\varepsilon}{2}, \qquad \operatorname{diam} A_n < \frac{\varepsilon}{2}$$
hold. Then
$$\rho(p, x) \le \rho(p, x_n) + \rho(x_n, x) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
for every $x \in A_n$, so that $A_n \subset U_\varepsilon(p)$, as required.

To prove the uniqueness of $p$, suppose $q \ne p$ is another point satisfying the
condition of the lemma, and let $\rho(p, q) = 2\varepsilon > 0$. Then the neighborhoods $U_\varepsilon(p)$
and $U_\varepsilon(q)$ do not intersect, and moreover there are subsets $A, B \in Q$ such that
$A \subset U_\varepsilon(p)$, $B \subset U_\varepsilon(q)$. But then it is impossible for one of the subsets $A$ and $B$
to contain the other. This contradiction shows that $q = p$. ∎


b. THEOREM. Let $Q$ be any system of nested closed subsets of a complete metric
space $M$, and suppose $Q$ contains subsets of arbitrarily small diameter. Then
there is a unique point $p \in M$ such that every neighborhood (4) of $p$ contains
some set $A \in Q$. Moreover, $p$ belongs to every set in $Q$.

Proof. The first assertion follows at once from the lemma. Suppose $p$ does not
belong to some set $B \in Q$. Then, since $B$ is closed, there is an $\varepsilon > 0$ such that the
neighborhood $U_\varepsilon(p)$ does not intersect $B$. By the first assertion, there is a set
$A \in Q$ entirely contained in $U_\varepsilon(p)$. But then $A$ cannot intersect $B$. This contradicts
the fact that either $A \subset B$ or $B \subset A$, thereby proving the second assertion. ∎


c. As a special case of Theorem 3.74b, we have the following principle of
nested balls: Let $V_n = \{x \in M: \rho(x, x_n) \le \varepsilon_n\}$ $(n = 1, 2, \dots)$ be a sequence of nested
closed balls in a complete metric space $M$ such that $\varepsilon_n \to 0$ as $n \to \infty$. Then the
intersection of all the balls $V_n$ consists of a single point $x_0$.


d. Remark. A sequence of nested closed intervals on the real line always has a
nonempty intersection, whether or not the lengths of the intervals approach
zero (see Theorem 1.81). However, in a metric space (even in a complete metric
space) there can exist sequences of nested closed balls with an empty
intersection. For example, consider the space consisting of a countable sequence
of points $x_1, x_2, \dots$ equipped with the metric
$$\rho(x_n, x_{n+p}) = 1 + \frac{1}{n} \qquad (n, p = 1, 2, \dots),$$
where $\rho(x_n, x_n) = 0$ by definition. This space satisfies all the axioms of a metric
space. Moreover, the space is complete, since it has no nonconvergent
fundamental sequences (in fact, there are no fundamental sequences at all
consisting of distinct points). The closed ball $V_n$ of radius $1 + (1/n)$ centered at $x_n$
contains the points $x_n, x_{n+1}, \dots$ and no other points, and hence
$$V_1 \supset V_2 \supset \cdots \supset V_n \supset \cdots.$$
Nevertheless, the intersection
$$\bigcap_{n=1}^{\infty} V_n$$
is empty!
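This counterexample is easy to model on finitely many indices. In the sketch below the point $x_n$ is represented by the integer $n$, and the distance between distinct points is $1 + 1/\min(m, n)$, which is the metric above written symmetrically; the function names and the finite `universe` of indices are assumptions of the illustration.

```python
from fractions import Fraction


def rho(m: int, n: int) -> Fraction:
    """Distance between x_m and x_n: 0 if m == n, else 1 + 1/min(m, n)."""
    if m == n:
        return Fraction(0)
    return 1 + Fraction(1, min(m, n))


def closed_ball(n: int, universe) -> set:
    """Indices k in `universe` with rho(x_n, x_k) <= 1 + 1/n."""
    radius = 1 + Fraction(1, n)
    return {k for k in universe if rho(n, k) <= radius}


def triangle_inequality_holds(universe) -> bool:
    """Check rho(a, c) <= rho(a, b) + rho(b, c) for all triples in `universe`."""
    return all(
        rho(a, c) <= rho(a, b) + rho(b, c)
        for a in universe for b in universe for c in universe
    )
```

Within any finite truncation the balls still share the last index, so emptiness of the intersection is a genuinely infinite phenomenon. What the model does show is the mechanism: $x_k$ lies in $V_k$ but not in $V_{k+1}$, so no point survives in all the balls.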


3.75. a. THEOREM (Baire). Suppose a complete metric space $M$ is the union of a
countable number of closed subsets $F_1, F_2, \dots \subset M$. Then at least one subset $F_n$
contains a closed ball in $M$.

Proof. Suppose to the contrary that none of the sets $F_1, F_2, \dots$ contains a closed
ball, and let $x_1$ be a point not belonging to $F_1$. Since $F_1$ is closed, there is a
closed ball
$$V_{\varepsilon_1}(x_1) = \{x \in M: \rho(x, x_1) \le \varepsilon_1\}$$
which does not intersect $F_1$. The ball $V_{\varepsilon_1/2}(x_1)$ contains a point $x_2$ not belonging
to $F_2$ (why?). Moreover, there is a closed ball $V_{\varepsilon_2}(x_2)$ which does not intersect
$F_2$, where it can be assumed that
$$V_{\varepsilon_2}(x_2) \subset V_{\varepsilon_1}(x_1), \qquad \varepsilon_2 < \frac{\varepsilon_1}{2}.$$
Continuing this construction indefinitely, we get a sequence of nested closed
balls
$$V_{\varepsilon_1}(x_1) \supset V_{\varepsilon_2}(x_2) \supset \cdots$$
such that $V_{\varepsilon_n}(x_n)$ does not intersect the set $F_n$, and moreover $\varepsilon_n \to 0$ as
$n \to \infty$. It follows from the principle of nested balls (Sec. 3.74c) that the
intersection of all the balls $V_{\varepsilon_1}(x_1), V_{\varepsilon_2}(x_2), \dots$ consists of a single point $x_0$
which does not belong to any of the sets $F_1, F_2, \dots$ This contradicts the condition
$$x_0 \in M = \bigcup_{n=1}^{\infty} F_n,$$
thereby proving the theorem. ∎


b. Example. The set $Z$ of all irrational points of the interval $M = [a, b]$ cannot be
represented as a countable union of closed subsets of $M$. In fact, if we had
$$Z = \bigcup_{n=1}^{\infty} F_n,$$
where $F_1, F_2, \dots \subset M$ are closed sets, then the whole interval $M$, which is a
complete space (Sec. 3.73b), could be represented as a countable union of closed
subsets of $M$ (namely, the countable collection of sets $F_1, F_2, \dots$, together with
the countable collection of all one-element sets containing a single rational point
each). But this would contradict Baire’s theorem, since none of these subsets can
contain a closed interval.


c. In Theorem 2.41 we used the principle of nested intervals to prove that the set
of all points in the unit interval $[0, 1]$ is uncountable. We now prove a related
result valid for a large class of complete metric spaces. First we introduce the
following definition: A point $x_0$ of a metric space $M$ is said to be isolated if there
is some ball $\{x \in M: \rho(x, x_0) < \delta\}$ which contains no points of $M$ other than the
point $x_0$ itself. For example, let $M$ be a set of points on the real line equipped
with the usual metric. Then $x_0 \in M$ is an isolated point if and only if there is an
open interval centered at $x_0$ containing no points of $M$ other than $x_0$ itself.

LEMMA. Let $M$ be a complete metric space consisting of only countably many
points. Then $M$ contains an isolated point.

Proof. Every one-element subset of $M$ is closed (why?). Applying Baire’s
theorem, we see that some subset $\{x_0\} \subset M$ contains a closed ball $V_\varepsilon(x_0)$. But
this is possible only if $x_0$ is an isolated point. ∎

d. THEOREM. Every complete metric space without isolated points is uncountable.

Proof. An immediate consequence of the above lemma. ∎

This theorem ceases to be true if we drop the condition that $M$ have no
isolated points. Consider, for example, any countable closed set on the line (e.g.,
any convergent sequence of points together with its limit), regarded as a metric
space in its own right.


3.8. Completion of a Metric Space 


3.81. The concept of a complete metric space plays a key role in the theorems of
Secs. 3.73-3.75, and will continue to figure prominently in our subsequent
considerations. As we now show, every incomplete metric space can be
“embedded” in a certain complete space.

THEOREM (Hausdorff). Let $M$ be a metric space, in general incomplete. Then
there exists a complete metric space $\overline{M}$, called the completion of $M$, with the
following properties: (1) $M$ is isometric to a subset $M_1 \subset \overline{M}$; (2) $M_1$ is dense in
$\overline{M}$. Moreover, any two spaces satisfying Properties 1 and 2 are isometric.

Proof. The proof will be given in several steps (Secs. 3.82-3.87).


3.82. Step 1. Two fundamental sequences $x_n$ and $y_n$ in the space $M$ are said to be
cofinal (with each other) if
$$\lim_{n \to \infty} \rho(x_n, y_n) = 0,$$
where $\rho$ is the metric of $M$. For example, any two sequences in $M$ converging to
the same limit are cofinal, while two sequences converging to different limits are
noncofinal. Two fundamental sequences which are cofinal with a third sequence
are clearly cofinal with each other. Hence the set of all fundamental sequences
consisting of elements of $M$ can be partitioned into classes such that all
sequences in the same class are cofinal, while any sequence not in a given class
is noncofinal with every sequence in the class.

We now use these classes, denoted by $X, Y, \dots$, to construct a new metric space
$\overline{M}$, defining the distance between two classes $X$ and $Y$ by the formula
$$\overline{\rho}(X, Y) = \lim_{n \to \infty} \rho(x_n, y_n), \tag{1}$$
where $x_n$ is any fundamental sequence from the class $X$ and $y_n$ is any
fundamental sequence from the class $Y$. First of all, we must verify that the limit
(1) exists and is independent of the choice of the sequences $x_n$ and $y_n$ from the
classes $X$ and $Y$. It follows from the quadrilateral inequality (1), p. 53 that
$$|\rho(x_n, y_n) - \rho(x_{n+p}, y_{n+p})| \le \rho(x_n, x_{n+p}) + \rho(y_n, y_{n+p}),$$
and hence the numbers $\rho(x_n, y_n)$ satisfy the Cauchy convergence criterion, i.e., are
themselves a fundamental sequence on the real line $R$. Thus the limit (1) indeed
exists, by Theorem 3.72b. Moreover, if $x'_n$ and $y'_n$ are other fundamental sequences
from the classes $X$ and $Y$, respectively, then, by the quadrilateral inequality again,
$$|\rho(x_n, y_n) - \rho(x'_n, y'_n)| \le \rho(x_n, x'_n) + \rho(y_n, y'_n) \to 0$$
as $n \to \infty$, so that the sequence $\rho(x'_n, y'_n)$ has the same limit as the sequence
$\rho(x_n, y_n)$. Thus the definition (1) of the distance between two given classes is
indeed independent of the particular choice of fundamental sequences from the
classes.
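A concrete instance of cofinal sequences, chosen here for illustration and not taken from the text, arises when $M$ is the set of rational numbers with the usual metric. The Newton iterates for $\sqrt{2}$ and the decimal truncations of $\sqrt{2}$ are both fundamental sequences of rationals with no rational limit, and they are cofinal. The sketch below computes the distances $\rho(x_n, y_n)$ exactly; the function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def newton_sqrt2(n: int) -> Fraction:
    """n Newton steps x -> (x + 2/x) / 2 for the square root of 2, starting from 1."""
    x = Fraction(1)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x


def decimal_sqrt2(n: int) -> Fraction:
    """The square root of 2 truncated to n decimal places, as an exact rational."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)


def gaps(count: int) -> list:
    """The distances |x_n - y_n| between the two sequences for n = 1..count."""
    return [abs(newton_sqrt2(n) - decimal_sqrt2(n)) for n in range(1, count + 1)]
```

The distances shrink toward zero, so the two sequences fall into one class, and that class plays the role of $\sqrt{2}$ in the completion of the rationals.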


3.83. Step 2. Next we verify that the quantity (1) satisfies Axioms a-c of Sec.
3.11, thereby confirming that (1) is actually a metric in $\overline{M}$. First we note that
$\overline{\rho}(X, X) = 0$, as follows at once by setting $x_n = y_n$ in (1). Moreover, suppose
$\overline{\rho}(X, Y) = 0$. This means that
$$\lim_{n \to \infty} \rho(x_n, y_n) = 0$$
for any fundamental sequence $x_n$ in $X$ and any fundamental sequence $y_n$ in $Y$. But
then $x_n$ and $y_n$ are cofinal sequences, and hence the classes $X$ and $Y$ must
coincide. It follows that $\overline{\rho}(X, Y) = 0$ implies $X = Y$, or equivalently that
$\overline{\rho}(X, Y) > 0$ if $X \ne Y$. Together with $\overline{\rho}(X, X) = 0$, this establishes Axiom a. The
validity of Axiom b is an immediate consequence of the symmetry of distance in
$M$, since $\rho(y_n, x_n) = \rho(x_n, y_n)$ obviously implies
$$\overline{\rho}(Y, X) = \lim_{n \to \infty} \rho(y_n, x_n) = \lim_{n \to \infty} \rho(x_n, y_n) = \overline{\rho}(X, Y).$$
As for Axiom c, let $x_n$, $y_n$, and $z_n$ be fundamental sequences from the classes $X$, $Y$
and $Z$, respectively. Then, by the triangle inequality in $M$,
$$\rho(x_n, z_n) \le \rho(x_n, y_n) + \rho(y_n, z_n). \tag{2}$$
Using Theorem 3.32e to take the limit of (2) as $n \to \infty$, we get the triangle
inequality $\overline{\rho}(X, Z) \le \overline{\rho}(X, Y) + \overline{\rho}(Y, Z)$ in $\overline{M}$, thereby verifying Axiom c.


3.84. Step 3. We now show that $\overline{M}$ contains a subset $M_1$ isometric to the
space $M$. Suppose that with each element $x \in M$ we associate the class
$X \in \overline{M}$ containing the sequence $x, x, \ldots, x, \ldots$, i.e., the class
of all sequences converging to $x$. Then the set $M_1$ of all such classes $X$ is a
subset of $\overline{M}$ isometric to the original space $M$. In fact, if $X$ and $Y$
are the classes in $M_1$ corresponding to the elements $x$ and $y$ in $M$, we have

$$\rho(X, Y) = \lim_{n \to \infty} \rho(x, y) = \rho(x, y),$$

thereby verifying Property 1.


3.85. Step 4. To verify Property 2, i.e., that $M_1$ is dense in $\overline{M}$, we
argue as follows. Given any class $X \in \overline{M}$, let
$x_1, x_2, \ldots, x_n, \ldots$ be any fundamental sequence in $X$. Consider the sequence
of classes $X_1, X_2, \ldots, X_n, \ldots$, where $X_n$ corresponds to the sequence
$x_n, x_n, \ldots, x_n, \ldots$, i.e., corresponds to the element $x_n$ under the mapping
of Sec. 3.84. Given any $\varepsilon > 0$, let $N$ be such that
$\rho(x_m, x_n) < \varepsilon$ for all $m, n > N$. Then

$$\rho(X, X_n) = \lim_{m \to \infty} \rho(x_m, x_n) \le \varepsilon$$

for all $n > N$. But this means that the class $X$ is the limit of the sequence of
classes $X_1, X_2, \ldots, X_n, \ldots$ Since every $X_n$ belongs to $M_1$, it follows
that $M_1$ is dense in $\overline{M}$.


3.86. Step 5. Next we show that the space $\overline{M}$ is complete. Let
$X_1, X_2, \ldots, X_n, \ldots$ be a fundamental sequence of elements of $\overline{M}$.
Since $M_1$ is dense in $\overline{M}$, for every class $X_n$ we can find a class
$Y_n \in M_1$ such that

$$\rho(X_n, Y_n) < \frac{1}{n}.$$

Let $y_n$ be the element of $M$ corresponding to the class $Y_n$ under the mapping of
Sec. 3.84, i.e., the common limit of all the sequences in $Y_n$. Then the sequence
$y_1, y_2, \ldots, y_n, \ldots$ is fundamental in the space $M$, since

$$\rho(y_m, y_n) = \rho(Y_m, Y_n) \le \rho(Y_m, X_m) + \rho(X_m, X_n) + \rho(X_n, Y_n)
< \rho(X_m, X_n) + \frac{1}{m} + \frac{1}{n} \to 0$$

as $m, n \to \infty$. The fundamental sequence $y_1, y_2, \ldots, y_n, \ldots$ determines
a class $Y \in \overline{M}$. But this class is just the limit in $\overline{M}$ of the
sequence $X_1, X_2, \ldots, X_n, \ldots$ In fact, given any $\varepsilon > 0$, we have

$$\rho(Y, X_n) \le \rho(Y, Y_n) + \rho(Y_n, X_n) < \lim_{m \to \infty} \rho(y_m, y_n) + \frac{1}{n} < \varepsilon$$

for all sufficiently large $n$. It follows that every fundamental sequence
$X_1, X_2, \ldots, X_n, \ldots$ of elements of $\overline{M}$ has a limit in
$\overline{M}$, i.e., that $\overline{M}$ is complete.


3.87. Final Step. To complete the proof, we show that any complete metric space
$\widetilde{M}$ with Properties 1 and 2 is isometric to the space $\overline{M}$. Let
$M_1$ and $M_2$ be the subsets of the spaces $\overline{M}$ and $\widetilde{M}$
isometric to the space $M$ and hence isometric to each other. Given any element
$X \in \overline{M}$, let $X_n \in M_1$ be a sequence converging to $X$. The
corresponding sequence $Y_n \in M_2$ is certainly fundamental. In fact, because of the
isometry between $M_1$ and $M_2$, the distance between any two elements of the
sequence $Y_n$ is the same as the distance between the elements of the sequence $X_n$
with the same indices. Since the space $\widetilde{M}$ is complete, it contains the
element

$$Y = \lim_{n \to \infty} Y_n.$$

We now associate this element $Y \in \widetilde{M}$ with the original element
$X \in \overline{M}$. This uniquely determines $Y$, since cofinal sequences in $M_1$
correspond to cofinal sequences in $M_2$, and replacing $X_n$ by a cofinal sequence
leads to replacing $Y_n$ by a cofinal sequence. The indicated correspondence is clearly
one-to-one and exhausts all the elements of $\overline{M}$ and $\widetilde{M}$. It only
remains to show that $\overline{M}$ and $\widetilde{M}$ are isometric. Let $X, X'$ be
elements of $\overline{M}$, and let $Y, Y'$ be the corresponding elements of
$\widetilde{M}$. Suppose that

$$X = \lim_{n \to \infty} X_n, \qquad X' = \lim_{n \to \infty} X'_n \qquad (X_n, X'_n \in M_1),$$

and let $Y_n, Y'_n$ be the elements of $M_2$ corresponding to the elements $X_n, X'_n$.
Then

$$\rho(X_n, X'_n) = \rho(Y_n, Y'_n),$$

and hence, by Theorem 3.32b,

$$\rho(Y, Y') = \lim_{n \to \infty} \rho(Y_n, Y'_n) = \lim_{n \to \infty} \rho(X_n, X'_n) = \rho(X, X'). \quad ∎$$


3.88. Let $M$ be a metric space which is a subset of a complete metric space $M^*$.
Then $\overline{M}$, the closure of $M$ (relative to $M^*$), can be chosen as the
completion of $M$. In fact, $\overline{M}$ is complete, being a closed subset of a
complete space (see Theorem 3.73b), and moreover $M$ is obviously a dense subset of
$\overline{M}$. Thus $\overline{M}$ satisfies Properties 1 and 2 of Theorem 3.81, and
hence can serve as the completion of $M$.


3.9. Compactness 


3.91. a. Definition. A metric space $M$ is said to be compact if every sequence of
points in $M$ has a limit point in $M$. A compact metric space $M$ is often called a
compactum. A metric space $M$ is said to be locally compact if every point of $M$
has a neighborhood whose closure is compact.


b. THEOREM. A metric space $M$ is compact if and only if every infinite subset
$E \subset M$ has a limit point in $M$.


Proof. Suppose $M$ is compact, and let $E$ be any infinite subset of $M$. Then $E$
contains a sequence $x_n$ of distinct points. This sequence has a limit point in $M$,
which is clearly also a limit point of the set $E$.
Conversely, suppose every infinite subset $E \subset M$ has a limit point, and let
$x_n$ be any sequence of (not necessarily distinct) points of $M$. If the sequence
contains only finitely many distinct points of $M$, at least one of these points must
“repeat itself infinitely often,” and this point is then a limit point of the sequence.
On the other hand, if the sequence contains infinitely many distinct points of $M$,
then the infinite set consisting of these points has a limit point, and this limit
point is also a limit point of the sequence $x_n$. In either case, $M$ is compact. ∎


This leads at once to the following alternative definition of compactness: A
metric space $M$ is said to be compact if every infinite subset $E \subset M$ has a
limit point in $M$.


c. Examples. Any closed interval $[a, b]$ on the real line is a compactum, by the
Bolzano-Weierstrass principle (Theorem 3.45). The whole real line $R$ is not a
compact space, since, for example, the sequence $1, 2, \ldots, n, \ldots$ has no limit
points. However, $R$ is a locally compact space, since every point $x \in R$ has a
neighborhood whose closure is compact, namely any closed interval centered at
$x$. The extended real line $\overline{R}$ with the metric $r(x, y)$ of Theorem 3.35d
is compact. The set of all rational points in the interval $[a, b]$ is neither compact
nor locally compact.


d. As noted in Sec. 3.43, the property of being a limit point of a given sequence
does not change if we equip the given metric space with a new metric
homeomorphic to the original metric. More concisely, the property of being a
limit point is invariant under transformation to a new metric homeomorphic to
the original metric. Since the definitions of Sec. 3.91a involve only the notion of
a limit point, we conclude that the property of being compact or locally compact
is invariant under transformation to a new metric homeomorphic to the original
metric. Thus the real line $R$ is locally compact when equipped with either the
ordinary metric $\rho(x, y) = |x - y|$ or the metric $r(x, y)$ of Theorem 3.35d.


3.92. a. THEOREM. Every compact metric space $M$ is complete.


Proof. Given any fundamental sequence $x_n$ of points in $M$, let $x \in M$ be the
limit point of $x_n$ guaranteed by the compactness of $M$. Then, as we now show, $x_n$
converges to $x$, thereby proving the completeness of $M$. In fact, given any
$\varepsilon > 0$, we first choose $N$ such that $\rho(x_m, x_n) < \varepsilon/2$ for
all $m, n > N$, and afterwards choose $p > N$ such that $\rho(x_p, x) < \varepsilon/2$.
Then

$$\rho(x_n, x) \le \rho(x_n, x_p) + \rho(x_p, x) < \varepsilon$$

for all $n > N$, i.e., $x_n \to x$ as $n \to \infty$. ∎


b. THEOREM. Every compact subset $M$ of a metric space $P$ is closed in $P$.
Proof. An immediate consequence of Theorems 3.92a and 3.73a. ∎


c. THEOREM. Given a compact subset $M$ of a metric space $P$ and an open set $G$
such that $M \subset G \subset P$, let the (open) set $M_\delta$ be the union of all
open balls of radius $\delta > 0$ centered at points of $M$. Then
$M_\delta \subset G$ for a suitable value of $\delta$.


Proof. Suppose to the contrary that given any $n = 1, 2, \ldots$, there are two points
$x_n \in M$ and $y_n \in P - G$ such that $\rho(x_n, y_n) < 1/n$. The sequence $x_n$
lies in the compactum $M$, and hence has a limit point $x \in M$. Moreover, by Theorem
3.42, the sequence $x_n$ has a subsequence $x_{n_m}$ converging to $x$, i.e., such that
$x_{n_m} \to x$ as $m \to \infty$. Since

$$\rho(y_{n_m}, x) \le \rho(y_{n_m}, x_{n_m}) + \rho(x_{n_m}, x) < \frac{1}{m} + \rho(x_{n_m}, x),$$

we also have $y_{n_m} \to x$ as $m \to \infty$. But the set $P - G$ is closed. Hence
$x \in P - G$, which is incompatible with $x \in M$. The proof now follows by
contradiction. ∎


3.93. a. We now introduce a somewhat larger class of spaces, including compact
spaces as a subclass. A metric space $M$ is said to be precompact if every
sequence of points in $M$ contains a fundamental subsequence. If $M$ is complete,
this fundamental subsequence converges to a point in $M$, so that a complete
precompact space is necessarily compact. Conversely, every compact space is
complete (Theorem 3.92a) and obviously precompact. The open interval $(a, b)$
on the real line is a simple example of a space which is precompact but not
compact.


b. THEOREM. Every precompact space $M$ is bounded.


Proof. It suffices to show that an unbounded space $M$ is necessarily
nonprecompact. If $M$ is unbounded, then, by Sec. 3.12c, given any $C > 0$ and any
point $a \in M$, we can find a point $x \in M$ such that $\rho(a, x) > C$. Noting this,
we fix a point $x_1 \in M$ and then inductively construct a sequence of points
$x_2, x_3, \ldots$ in $M$ such that

$$\rho(x_1, x_2) > 1,$$

$$\rho(x_n, x_{n+1}) > \sum_{k=1}^{n-1} \rho(x_k, x_{k+1}) + 1 \qquad (n = 2, 3, \ldots).$$

It then follows from Theorem 3.11a that

$$\rho(x_m, x_n) \ge \rho(x_{n-1}, x_n) - \rho(x_m, x_{n-1})$$

$$\ge \rho(x_{n-1}, x_n) - [\rho(x_m, x_{m+1}) + \cdots + \rho(x_{n-2}, x_{n-1})]$$

$$\ge \rho(x_{n-1}, x_n) - \sum_{k=1}^{n-2} \rho(x_k, x_{k+1}) > 1$$

for all $n > m$. Hence the sequence $x_n$ cannot contain a fundamental subsequence,
and the space $M$ is nonprecompact. ∎
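
The inductive construction in this proof can be traced on a concrete unbounded space. The sketch below is an illustration under stated assumptions, not part of the original argument: the space is the real line with $\rho(x, y) = |x - y|$, and the names `far_apart_sequence` and `far_point` are hypothetical. The helper `far_point(a, C)` plays the role of Sec. 3.12c, returning some point whose distance from `a` exceeds `C`; here it deliberately jumps back and forth across the origin, so that the lower bound on the mutual distances is not obvious without the estimate in the proof.

```python
def far_apart_sequence(n_terms, start, far_point, rho):
    """Build x_1, ..., x_n with rho(x_1, x_2) > 1 and
    rho(x_n, x_{n+1}) > sum_{k<n} rho(x_k, x_{k+1}) + 1."""
    xs = [start]
    total = 0.0  # running sum of rho(x_k, x_{k+1}) over the steps made so far
    while len(xs) < n_terms:
        nxt = far_point(xs[-1], total + 1)
        total += rho(xs[-1], nxt)
        xs.append(nxt)
    return xs

def rho(x, y):
    """Ordinary distance on the real line."""
    return abs(x - y)

def far_point(a, C):
    """Some point at distance C + 1 > C from a (alternating sides of a)."""
    return a - (C + 1) if a > 0 else a + (C + 1)

xs = far_apart_sequence(8, 0.0, far_point, rho)
min_gap = min(rho(a, b) for i, a in enumerate(xs) for b in xs[i + 1:])
print(xs, min_gap)
```

Every pair of terms ends up more than 1 apart, so no subsequence can be fundamental, which is exactly what the chain of inequalities in the proof asserts.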


c. To test a metric space $M$ for precompactness, it is sometimes convenient to
embed $M$ isometrically in a larger metric space $P$. We then call a set
$B \subset P$ an $\varepsilon$-net for the set $M \subset P$ if the distance from every
point $x \in M$ to some point $y \in B$ (in general, depending on $x$) does not exceed
$\varepsilon$. In this case, the union of all closed balls of radius $\varepsilon$
centered at the points of $B$ contains the whole set $M$.
More generally, suppose the union of all the sets $E_\alpha$, indexed by a parameter
$\alpha$, contains a set $M$. Then the sets $E_\alpha$ are said to cover $M$ or to form
a covering of $M$. Thus a set $B$ is said to be an $\varepsilon$-net for a set $M$ if
the set of all closed balls of radius $\varepsilon$ centered at points of $B$ covers
the set $M$.†


THEOREM (Hausdorff’s criterion). A subset $M$ of a metric space $P$ is precompact
(in the metric $\rho$ of $P$) if and only if, given any $\varepsilon > 0$, $P$ contains
a finite $\varepsilon$-net for $M$.


Proof. Suppose $M$ is precompact. Then, given any $\varepsilon > 0$, we construct an
$\varepsilon$-net for $M$ as follows. Choose any point $x_1 \in M$. If every point
$x \in M$ is such that $\rho(x_1, x) \le \varepsilon$, the point $x_1$ is itself an
$\varepsilon$-net for $M$, and the construction is finished. However, if there are
points of $M$ whose distance from $x_1$ exceeds $\varepsilon$, we choose one of these
points as $x_2$. If now every point $x \in M$ is such that either
$\rho(x_1, x) \le \varepsilon$ or $\rho(x_2, x) \le \varepsilon$, the points $x_1$ and
$x_2$ form a finite $\varepsilon$-net for $M$, and the construction is finished.
Otherwise, we continue the construction, noting that the distance from each new point
$x_n$ to each of the preceding points $x_1, x_2, \ldots, x_{n-1}$ exceeds
$\varepsilon$. Hence if the construction fails to terminate after a finite number of
steps, we would get a sequence of distinct points $x_1, x_2, \ldots, x_n, \ldots$ in
$M$ which certainly contains no fundamental subsequence, thereby contradicting the
precompactness of $M$. It follows that the construction must terminate after a finite
number of steps, resulting in a finite $\varepsilon$-net for the set $M$.

Conversely, suppose that given any $\varepsilon > 0$, $P$ contains a finite
$\varepsilon$-net for $M$, and let $A$ be any infinite subset of $M$ (in particular, a
sequence containing infinitely many distinct points of $M$). Then we can select a
fundamental sequence from $A$ as follows: Let any point $x_0 \in A$ be the first point.
Then, choosing $\varepsilon = 1$ in the finite $\varepsilon$-net condition, we cover
$A$ with a finite number of closed balls of radius 1. One of these balls, say $V_1$,
must contain an infinite subset $A_1 \subset A$. Let $x_1$ be any point in $A_1$
distinct from $x_0$. Choosing $\varepsilon = \frac{1}{2}$, we then cover $A_1$ in turn
with a finite number of closed balls of radius $\frac{1}{2}$. One of them, say $V_2$,
must contain an infinite subset $A_2 \subset A_1$. Let $x_2$ be any point of $A_2$
distinct from $x_0$ and $x_1$. Continuing this process indefinitely, we get a sequence
of infinite subsets

$$A \supset A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots$$

(where each $A_n$ is contained in a closed ball of radius $1/n$) and a sequence of
distinct points $x_0, x_1, x_2, \ldots, x_n, \ldots$ with $x_n \in A_n$. This sequence
is fundamental. In fact, if $m < n$, then

$$x_n \in A_n \subset A_m,$$

and hence $\rho(x_m, x_n) \le 2/m$, where the right-hand side approaches zero as
$m \to \infty$. It follows that $M$ is precompact. ∎
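
The construction in the first half of the proof is effectively a greedy procedure, and it can be sketched on a finite sample. The following is a minimal illustration under stated assumptions: the set $M$ is replaced by a hypothetical finite grid in the unit square of the plane, the metric is the Euclidean one, and the function name `greedy_eps_net` is not from the text. A point is added to the net only if its distance from every point already chosen exceeds $\varepsilon$.

```python
import math

def greedy_eps_net(points, eps, dist=math.dist):
    """Keep adding a point whose distance from every chosen point exceeds eps."""
    net = []
    for x in points:
        if all(dist(x, c) > eps for c in net):
            net.append(x)
    return net

# Hypothetical sample standing in for M: a 21 x 21 grid in the unit square.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
eps = 0.25
net = greedy_eps_net(grid, eps)

# Largest distance from a sample point to the net, and smallest gap inside the net.
covering_radius = max(min(math.dist(x, c) for c in net) for x in grid)
separation = min(math.dist(a, b) for i, a in enumerate(net) for b in net[i + 1:])
print(len(net), covering_radius, separation)
```

Any point rejected by the loop lies within `eps` of some chosen point, so the result is an $\varepsilon$-net for the sample, while the chosen points are pairwise more than `eps` apart, as noted in the proof.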


3.94. THEOREM. Every bounded subset $M$ of the $n$-dimensional Euclidean space $R_n$
is precompact.


Proof. Being bounded, $M$ is contained in some closed ball $V \subset R_n$ (Sec.
3.14c), and every such ball contains only finitely many points whose coordinates all
have the form $k/2^m$, where $k$ is an integer and $m > 0$ is a fixed integer. But,
given any $\varepsilon > 0$, the set of all such points is obviously an
$\varepsilon$-net for $M$ provided $m$ is sufficiently large.
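
How large $m$ must be can be made explicit: rounding each coordinate to the nearest multiple of $2^{-m}$ moves a point of $R_n$ by at most $\sqrt{n}/2^{m+1}$. The sketch below checks this numerically; it is an illustration under stated assumptions, with the bounded set taken to be a hypothetical random sample from the cube $[-1, 1]^3$ (a cube rather than a ball, so that the rounded point stays inside the enclosing set) and with function names that are not from the text.

```python
import math
import random

def dyadic_level(eps, n):
    """Smallest m >= 1 with sqrt(n) / 2**(m + 1) <= eps."""
    m = 1
    while math.sqrt(n) / 2 ** (m + 1) > eps:
        m += 1
    return m

def nearest_dyadic(x, m):
    """Round every coordinate of x to the nearest multiple of 2**(-m)."""
    scale = 2 ** m
    return tuple(round(c * scale) / scale for c in x)

random.seed(0)
eps, n = 0.05, 3
m = dyadic_level(eps, n)
sample = [tuple(random.uniform(-1, 1) for _ in range(n)) for _ in range(1000)]
worst = max(math.dist(x, nearest_dyadic(x, m)) for x in sample)
print(m, worst)
```

The grid of such points inside the cube is finite, and every sample point lies within `eps` of it, which is the finite $\varepsilon$-net required by Hausdorff's criterion.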


3.95. THEOREM. A subset $M$ of a metric space $P$ is precompact if, given any
$\varepsilon > 0$, there is a (possibly infinite) precompact set
$B_\varepsilon \subset P$ which is an $\varepsilon$-net for $M$.


Proof. Let $Z$ be a finite $(\varepsilon/2)$-net for the set $B_{\varepsilon/2}$. ($Z$
exists, since $B_{\varepsilon/2}$ is precompact.) Then $Z$ is a finite
$\varepsilon$-net for the set $M$. In fact, given any point $x \in M$, there is a point
$y \in B_{\varepsilon/2}$ such that $\rho(x, y) \le \varepsilon/2$ and a point
$z \in Z$ such that $\rho(y, z) \le \varepsilon/2$, which together imply

$$\rho(x, z) \le \rho(x, y) + \rho(y, z) \le \varepsilon.$$

Thus, given any $\varepsilon > 0$, $M$ has a finite $\varepsilon$-net and hence is
precompact. ∎


3.96. a. THEOREM. The completion $\overline{M}$ of any precompact metric space $M$ is
a compactum.


Proof. Given any $\varepsilon > 0$, the set $M$ is an $\varepsilon$-net for
$\overline{M}$, being dense in $\overline{M}$. But $M$ is precompact, and hence by
Theorem 3.95, so is $\overline{M}$. Moreover, being complete, $\overline{M}$ is compact
as well (Sec. 3.93a). ∎


b. THEOREM. The closure $\overline{M}$ of any precompact subset $M$ of a complete
metric space $P$ is compact.


Proof. An immediate consequence of Theorem 3.96a and of the fact that the
closure of $M$ in $P$ can be chosen as the completion of $M$ (see Sec. 3.88). ∎


c. THEOREM. A precompact subset $M$ of a complete metric space $P$ is a compactum
if and only if $M$ is closed in $P$.


Proof. If $M$ is closed in $P$, then $M$ is itself a complete metric space, by Theorem
3.73b. But then, being precompact, $M$ is compact as well (Sec. 3.93a). The
converse assertion follows from Theorem 3.92b.


d. THEOREM. A subset $M$ of a complete metric space $P$ is a compactum if and only
if $M$ is closed in $P$ and, given any $\varepsilon > 0$, $P$ contains a finite
$\varepsilon$-net for $M$.


Proof. An immediate consequence of Theorem 3.93c (Hausdorff’s criterion) and
Theorem 3.96c.


e. THEOREM. A subset $M$ of the space $R_n$ is compact if and only if $M$ is closed
and bounded in $R_n$.


Proof. If $M$ is compact, then $M$ is closed, by Theorem 3.92b, and bounded, by
Theorem 3.93b. Conversely, if $M$ is bounded in $R_n$, then $M$ is precompact, by
Theorem 3.94, and hence, by the completeness of $R_n$ (Theorem 3.72d), the fact
that the precompact set $M$ is closed implies that $M$ is compact (Theorem 3.96c).


f. THEOREM. Every compact set $M$ on the real line is bounded and contains its
greatest lower and least upper bounds.


Proof. The set $M$ is closed and bounded, by the preceding theorem. Now use
Theorem 3.53b (and the subsequent remark).


3.97. THEOREM (Finite covering theorem). Suppose a compact subset $K$ of a
metric space $P$ is covered by a family $\mathcal{B} = \{B_\alpha\}$ of open subsets of
$P$. Then $K$ can also be covered by some finite subfamily $B_1, \ldots, B_m$ of
subsets of $\mathcal{B}$.


Proof. Suppose to the contrary that no finite subfamily of $\mathcal{B}$ covers $K$.
Since $K$ is compact, given any $\varepsilon > 0$, there is a finite number of closed
balls $V_1, \ldots, V_{m_\varepsilon}$ of radius $\varepsilon$ covering $K$ (by Theorem
3.93c). If each ball $V_j$ $(j = 1, \ldots, m_\varepsilon)$ could be covered by a
finite subfamily of $\mathcal{B}$, then all these subfamilies taken together would give
a new finite subfamily of $\mathcal{B}$ covering $K$ itself, contrary to hypothesis.
Therefore at least one of the balls $V_j$ cannot be covered by a finite subfamily of
$\mathcal{B}$. Choosing $\varepsilon = 1/n$, we see that for each $n = 1, 2, \ldots$
there is a closed ball $V_{1/n}(x_n)$ of radius $1/n$ centered at some point
$x_n \in K$ which cannot be covered by a finite subfamily of $\mathcal{B}$. Let $x$ be
a limit point of the sequence $x_n$ (since $K$ is compact, $x$ exists). Then there is a
set $B_\alpha \in \mathcal{B}$ containing both $x$ and some neighborhood of $x$, i.e.,
some open ball $U_\rho(x)$ of radius $\rho$ centered at $x$ (here we use the fact that
$B_\alpha$ is open). But then $U_\rho(x)$ contains balls $V_{1/n}(x_n)$ with
arbitrarily large values of $n$, so that each of these balls is covered by the single
set $B_\alpha$, contrary to construction. This contradiction shows that some finite
subfamily of $\mathcal{B}$ must cover $K$. ∎
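
The statement of the theorem (though not the method of its proof) can be illustrated on the compact interval $[0, 1]$. In the sketch below, which rests on stated assumptions, the open cover consists of the intervals $(c - r(c), c + r(c))$, one for each center $c \in [0, 1]$, where `radius` is a hypothetical positive function bounded away from zero; the function name `finite_subcover` is likewise not from the text. The procedure walks from left to right, always choosing the interval centered at the right endpoint of the previous one.

```python
def finite_subcover(radius, a=0.0, b=1.0):
    """Pick finitely many intervals (c - radius(c), c + radius(c)) covering [a, b].

    Assumes radius(c) stays above some positive constant, so the walk terminates.
    """
    chosen = []
    t = a
    while True:
        r = radius(t)
        chosen.append((t - r, t + r))
        if t + r > b:
            return chosen
        t = t + r  # right endpoint of an open interval: not yet covered

def radius(x):
    """Hypothetical radius function defining the cover (never below 0.05)."""
    return 0.05 + 0.1 * x * (1 - x)

cover = finite_subcover(radius)
covered = all(any(lo < i / 1000 < hi for lo, hi in cover) for i in range(1001))
print(len(cover), covered)
```

The infinite family of intervals is thus replaced by a couple of dozen at most. For a cover whose radii are not bounded away from zero this simple walk may stall, whereas the theorem still guarantees a finite subfamily.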


3.98. In Sec. 3.74 we showed how the principle of nested intervals (Theorem
1.81) can be generalized from the real line to an arbitrary complete metric space,
provided we change nested closed intervals to nested closed subsets of arbitrarily
small diameter. It will be recalled that the intersection of all the nested closed
subsets may well be empty if the proviso “of arbitrarily small diameter” is
dropped (see Sec. 3.74d). The following theorem shows that this cannot happen
if the subsets are compact:


THEOREM. Let $Q$ be a system of nested (nonempty) compact subsets of a metric space
$M$. Then the intersection of all the sets of $Q$ is nonempty.


Proof. Suppose the theorem is false, and choose any $K_0 \in Q$. Then for each
$x \in K_0$ there is a set $K_x \in Q$ such that $x \notin K_x$. Since $K_x$ is closed
(by Theorem 3.92b), there is a neighborhood $U_x$ of $x$ which does not intersect
$K_x$. The set of all such $U_x$ $(x \in K_0)$ covers $K_0$. Hence, by Theorem 3.97,
$K_0$ can be covered by a finite number of these neighborhoods, say
$U_1, \ldots, U_m$. Let $K_1, \ldots, K_m \in Q$ be sets which do not intersect
$U_1, \ldots, U_m$, respectively. Then the intersection $K_1 \cdots K_m$ has no
points in common with any of the neighborhoods $U_1, \ldots, U_m$, and hence does not
intersect the original set $K_0$. It follows that the intersection
$K_0 K_1 \cdots K_m$ is empty. On the other hand, since $Q$ is a system of nested
sets, every finite intersection of sets of $Q$ is again a set of $Q$, and in
particular cannot be empty. This contradiction shows that the theorem is true.


Problems 


1. Let $A'$ denote the set of all limit points of a given subset $A$ of a metric space
$M$, and let $A^{(n)} = (A^{(n-1)})'$ $(n = 1, 2, \ldots)$ ($A^{(0)} = A$,
$A^{(1)} = A'$). Given any $n$, construct a set $A$ on the real line such that
$A^{(n)}$ is nonempty while $A^{(n+1)}$ is empty.

2. Given any $A \subset M$, prove that the set $A'$ is closed.

3. Given any set $A$ on the real line such that $A^{(n)}$ is countable for some $n$,
prove that $A$ itself is countable.

4. A point $x$ on the real line is called a condensation point of an uncountable set
$A$ if every neighborhood of $x$ contains uncountably many points of $A$. Prove that
every uncountable set $A$ has condensation points. More exactly, prove that
“almost all” points of an uncountable set $A$ (i.e., all points of $A$ with the possible
exception of at most countably many points) are condensation points.

5. Suppose a set $A$ on the real line is covered by an arbitrary family $\mathcal{B}$
of open intervals. Prove that $A$ can also be covered by a subfamily of $\mathcal{B}$
containing no more than countably many intervals.

6. Given any subset $A$ of a metric space $M$, the quantity

$$\rho(x, A) = \inf_{y \in A} \rho(x, y)$$

is called the distance from the point $x$ to the set $A$. Prove that the relations
$\rho(x, A) = 0$ and $x \in A$ are equivalent if $A$ is closed, but not if $A$ fails to
be closed.

7. Given any subset $A$ of a metric space $M$, prove that the set
$\{x \in M: \rho(x, A) < \varepsilon\}$ is open, while the set
$\{x \in M: \rho(x, A) \le \varepsilon\}$ is closed.

8. Given two nonintersecting closed subsets $F_1$ and $F_2$ of a metric space $M$,
construct nonintersecting open sets $G_1$ and $G_2$ such that $G_1 \supset F_1$,
$G_2 \supset F_2$.


9. Prove that the set of all closed subsets $A, B, \ldots$ of a bounded metric space
$M$ is itself a metric space when equipped with the metric

$$\rho(A, B) = \max \Bigl\{ \sup_{x \in A} \rho(x, B),\ \sup_{y \in B} \rho(y, A) \Bigr\}.$$

Prove that this space is complete if the original space $M$ is complete, and
compact if $M$ is compact.

10. Given a metric space $M$ consisting of $n$ points, prove that if $n < 4$ there
exists a metric space $M'$ isometric to $M$ and contained in the Euclidean space
$R_{n-1}$. Prove that this assertion is in general false if $n \ge 4$.

11. Let $\overline{R}_n$ denote the Euclidean space $R_n$ together with an extra point
$\infty$ (the “point at infinity”). Show that a metric $r$ can be introduced in
$\overline{R}_n$ such that (a) $r$ is homeomorphic to the usual metric $\rho$ of Sec.
3.14a on $R_n$; (b) Every sequence $x_m \in R_n$ which is unbounded in the usual metric
has $\infty$ as a limit point.

12. Solve the analogue of Problem 11 for the case where $R_n$ is replaced by an
arbitrary unbounded metric space $M$.

13. Suppose that in Problem 11 we use lines going through the center of the
sphere $S_n$ instead of the lines considered in the hint. What choice of “elements at
infinity” now guarantees the existence of a limit point (in the new metric) for
every sequence of points $x_m \in R_n$?


† Because of the symmetry of the distance, we can also call $\rho(x, y)$ the “distance between $x$ and $y$.”


‡ This theorem generalizes a fact familiar from elementary geometry, namely that the length of a polygonal
line does not exceed the sum of the lengths of the segments making up the line.


† As always, the radical denotes the positive square root.

‡ Formula (5) is obviously an alternative version of the triangle inequality of Sec. 3.11.

† This, of course, explains why $U$ is called an open ball in the first place.

† That is, every union of a finite or countable collection of open intervals.

† The sequence (1), with “general term” $x_n$, is often simply called “the sequence $x_n$.”


‡ The uniqueness of the limit $x$ is proved in Theorem 3.33a.

† The content of this theorem might be summarized by saying that “distance is continuous” (cf. Sec. 5.12b).


† A much more systematic treatment of limits will be given below (Chapter 4). For the time being, we
confine ourselves to proving some simple properties of convergent numerical sequences, i.e., convergent
sequences of real numbers (such sequences will be studied further in Sec. 4.6).


† That is, the numbers $\rho(a, x_n)$, where $a$ is some fixed point of $M$, form a bounded set on the real line (cf.
Sec. 3.12a).

† To avoid confusion, we now use the symbol $\sim$ to denote a one-to-one correspondence, rather than the
symbol $\leftrightarrow$ favored heretofore.

† Of course, this fact can easily be verified directly.

† This, of course, explains why $V$ is called a closed ball in the first place.

† The property of being complete (or not) is obviously preserved under isometry.

† Note that


1 l !
† “Suppose a lamp illuminating a ball of radius $\varepsilon$ is placed at every point of a set $B$ which is an $\varepsilon$-net for a
set $M$. Then the whole set $M$ will be illuminated.” (L. A. Liusternik)


4 Limits 


4.1. Basic Concepts 


4.11. Given any set $E$, a system $S$ of nonempty subsets $A, B, \ldots$ of $E$ is
called a direction (on $E$) if (1) Either $A \subset B$ or $B \subset A$ for every pair
of sets $A, B \in S$; (2) The intersection of all the sets $A \in S$ is empty.


4.12. Let $f(x)$ be a function defined on a set $E$, taking values in a metric space
$M$ equipped with a distance $\rho$ (Sec. 3.11). Then we say that $f(x)$ approaches the
limit $p$ in the direction $S$ if, given any $\varepsilon > 0$, there exists a set
$A \in S$ such that

$$\rho(p, f(x)) \le \varepsilon$$

for all $x \in A$.† This fact is expressed by writing

$$p = \lim_S f(x) \tag{1}$$

(more concisely, $f(x) \xrightarrow{S} p$ or just $f(x) \to p$). To say that $f(x)$ has
a limit in the direction $S$ means that there is some point $p \in M$ such that
$f(x) \xrightarrow{S} p$.
We now give examples illustrating these definitions (Secs. 4.13–4.16).


4.13. Let $E$ be the set of all positive integers $1, 2, \ldots$, and let $S$ be the
system of all subsets $A_n \subset E$ of the form

$$A_n = \{n, n + 1, n + 2, \ldots\} \qquad (n = 1, 2, \ldots).$$

Then obviously either $A_m \subset A_n$ or $A_n \subset A_m$ for every pair of sets
$A_m, A_n \in S$, while the intersection of all the sets $A_n$ $(n = 1, 2, \ldots)$ is
empty. Therefore $S$ is a direction, which we denote by $n \to \infty$. Here a function
$y = f(x)$ defined on $E$ is just a sequence $y_n$ of points in the metric space $M$.
Thus, according to the definition of Sec. 4.12, we say that $y_n$ approaches the limit
$p$ in the direction $S$, i.e., as $n \to \infty$, if, given any $\varepsilon > 0$,
there exists an integer $N > 0$ such that

$$\rho(p, y_n) \le \varepsilon$$

for all $n > N$. This definition clearly agrees with the one already given in Sec.
3.31. In the present case, (1) takes the form

$$p = \lim_{n \to \infty} y_n.$$
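
For a concrete sequence, the set $A_N$ required by this definition can be written down explicitly. The sketch below is an illustration under stated assumptions: the sequence is $y_n = 1/n$ on the real line with limit $p = 0$, and the function name `tail_index` is hypothetical. Since a computer can only inspect finitely many terms, the check covers a finite stretch of each tail and is a sanity check, not a proof.

```python
import math

def y(n):
    """The sequence y_n = 1/n."""
    return 1 / n

def tail_index(eps):
    """An index N such that |0 - y_n| <= eps for every n in A_N = {N, N + 1, ...}."""
    return math.ceil(1 / eps)

p = 0.0
checks = {}
for eps in (0.5, 0.25, 0.003):
    N = tail_index(eps)
    # Inspect the first 10000 elements of the tail A_N.
    checks[eps] = (N, all(abs(p - y(n)) <= eps for n in range(N, N + 10000)))
print(checks)
```

Smaller values of `eps` force a later tail, but for every `eps` some tail works, which is what approaching the limit in the direction $n \to \infty$ means.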


4.14. a. Let $E = R_a^+$ be the real half-line $\{x: x \ge a\}$, and let $S$ be the
system of all subsets $A_\xi \subset R_a^+$ of the form

$$A_\xi = \{x: x \ge \xi\} \qquad (\xi \ge a).$$

Then $S$ is obviously a direction, which we denote by $x \to +\infty$ (or simply
$x \to \infty$). Applying the general definition of Sec. 4.12 to this case, we say that
a function $f(x)$ approaches the limit $p$ in the direction $S$, i.e., as
$x \to +\infty$, if, given any $\varepsilon > 0$, there exists a number $\xi$ such that

$$\rho(p, f(x)) \le \varepsilon$$

for all $x \ge \xi$. Here the appropriate version of (1) is just

$$p = \lim_{x \to \infty} f(x).$$


b. If $M = R$ is the real line, $f(x)$ becomes a numerical function and we get the
following definition: A numerical function $f(x)$ is said to approach the limit $p$ as
$x \to \infty$ if, given any $\varepsilon > 0$, there exists a number $\xi$ such that

$$|p - f(x)| \le \varepsilon$$

for all $x \ge \xi$.


c. Now let $E = R_a^-$ be the real half-line $\{x: x \le a\}$ and let $S$ be the
system of all subsets $B_\xi \subset R_a^-$ of the form

$$B_\xi = \{x: x \le \xi\} \qquad (\xi \le a).$$

Then $S$ is obviously a direction, which this time we denote by $x \to -\infty$. The
general definition of Sec. 4.12 now takes the following form: A function $f(x)$ is
said to approach the limit $p$ as $x \to -\infty$ if, given any $\varepsilon > 0$,
there exists a number $\xi$ such that

$$\rho(p, f(x)) \le \varepsilon$$

(or $|p - f(x)| \le \varepsilon$ in the case of a numerical function) for all
$x \le \xi$. Formula (1) then becomes

$$p = \lim_{x \to -\infty} f(x).$$

d. We can also introduce a direction $x \to \pm\infty$, or equivalently
$|x| \to \infty$, corresponding to the system of all subsets of the real line of the
form $\{x: |x| \ge \xi\}$. This time formula (1) takes the form

$$p = \lim_{|x| \to \infty} f(x).$$

4.15. a. Next let $E$ itself be a metric space, equipped with a distance $\rho_0$.
Suppose $a$ is a nonisolated point of $E$ (cf. Sec. 3.75c), i.e., suppose every
neighborhood

$$U_\delta(a) = \{x \in E: \rho_0(x, a) < \delta\}$$

contains points of $E$ other than $a$ itself. Then the system of deleted
neighborhoods

$$U'_\delta(a) = \{x \in E: 0 < \rho_0(x, a) < \delta\} \qquad (\delta > 0),$$

each obtained by deleting the center $a$ from an ordinary neighborhood $U_\delta(a)$,
defines a direction which we denote by $x \to a$. Here the fact that $a$ is nonisolated
guarantees that every $U'_\delta(a)$ is nonempty, while the fact that
$a \notin U'_\delta(a)$ guarantees that the intersection of all the $U'_\delta(a)$
$(\delta > 0)$ is empty.† The definition of Sec. 4.12 now reads as follows: A function
$f(x)$ is said to approach the limit $p$ as $x \to a$ if, given any
$\varepsilon > 0$, there exists a number $\delta > 0$ such that

$$\rho(p, f(x)) \le \varepsilon$$

for all $x \in U'_\delta(a)$, i.e., for all $x$ satisfying the condition
$0 < \rho_0(x, a) < \delta$. Correspondingly, formula (1) becomes

$$p = \lim_{x \to a} f(x). \tag{2}$$

Note that the value of the function $f(x)$ at the point $x = a$ plays no role in this
definition. In fact, $f(x)$ may not even be defined at the point $x = a$.


b. In particular, if $E = R$, $M = R$, each deleted neighborhood $U'_\delta(a)$
reduces to the set $(a - \delta, a) \cup (a, a + \delta)$, obtained by deleting the
center $a$ from the open interval $(a - \delta, a + \delta)$, and the definition of
Sec. 4.12 then reads: A numerical function $f(x)$ is said to approach the limit $p$ as
$x \to a$ if, given any $\varepsilon > 0$, there exists a number $\delta > 0$ such that

$$|p - f(x)| \le \varepsilon$$

for all $0 < |x - a| < \delta$. In this case, we continue to write formula (2).
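
The remark that the value at $x = a$ plays no role can be illustrated by a function that is not defined at $a$ at all. The sketch below is an illustration under stated assumptions, with hypothetical names: it takes $f(x) = \sin x / x$, $a = 0$ and $p = 1$, and uses the elementary estimate $0 \le 1 - \sin x / x \le x^2/6$ (valid for all $x \ne 0$) to choose $\delta = \sqrt{6\varepsilon}$. The check samples finitely many points of the deleted neighborhood, so it is a sanity check rather than a proof.

```python
import math

def f(x):
    """sin(x)/x, undefined at the center x = 0 of the deleted neighborhoods."""
    return math.sin(x) / x

def delta_for(eps):
    """A radius delta with |1 - f(x)| <= eps whenever 0 < |x| < delta."""
    return math.sqrt(6 * eps)

results = {}
for eps in (0.1, 1e-3, 1e-6):
    delta = delta_for(eps)
    # Sample points x with 0 < |x| < delta on both sides of the center.
    xs = [s * delta * k / 1000 for k in range(1, 1000) for s in (1, -1)]
    results[eps] = all(abs(1 - f(x)) <= eps for x in xs)
print(results)
```

The point $x = 0$ is never evaluated, in keeping with the definition, which involves only the deleted neighborhoods.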


c. The examples of Secs. 4.14a–4.14c are actually special cases of the definition
of Sec. 4.15a.† To see this, we equip the real line $R$ with the metric of the space
$\overline{R}$, as in Theorem 3.35d. Then, in keeping with Sec. 3.35e, the directions
$x \to +\infty$ and $x \to -\infty$ as defined in Secs. 4.14a and 4.14c are equivalent
to the directions $x \to +\infty$ and $x \to -\infty$ as defined in Sec. 4.15a with
$+\infty$ and $-\infty$ regarded as points of the space $\overline{R}$.


4.16. Partial limits. Given a set $E$ equipped with a direction $S$, suppose we fix a
subset $G \subset E$ and then consider the system of sets $GA$, where $A$ is any set of
the system $S$. Suppose every set $GA$ is nonempty. Then, since the intersection of all
the sets $GA$ is empty (just like the intersection of all the sets $A \in S$), the
system of sets $GA$ defines a new direction, which we denote by $GS$. The limit in the
direction $GS$ might be called a “partial limit,” as opposed to the “full limit” in
the original direction $S$.


a. Let $f(x)$ be a function defined on $E$, taking values in a metric space $M$. If
$\lim_S f(x)$ exists and equals $p$, then obviously $\lim_{GS} f(x)$ also exists and
equals $p$. On the other hand, if $\lim_{GS} f(x)$ exists, then $\lim_S f(x)$ may or
may not exist. The following theorem gives a criterion for the equivalence of full and
partial limits:


THEOREM. If $G$ contains some set $B \in S$, then the existence of $\lim_{GS} f(x)$
implies that of $\lim_S f(x)$ and the two limits are equal. However, if $G$ contains no
set $B \in S$ and if the space $M$ contains at least two distinct points $p$ and
$q \ne p$, then there exists a function $f(x)$ such that $\lim_{GS} f(x) = p$ while
$\lim_S f(x)$ fails to exist.


Proof. Suppose first that

$$G \supset B, \qquad B \in S, \qquad \lim_{GS} f(x) = p.$$

Then, given any $\varepsilon > 0$, we can find a set $GA \in GS$ such that

$$\rho(p, f(x)) \le \varepsilon \tag{3}$$

for all $x \in GA$ ($\rho$ is the metric of $M$). The set $GA$ contains the set $BA$,
which in turn equals either $B$ or $A$ and belongs to the direction $S$. Therefore (3)
holds for all $x \in BA$. It follows that $\lim_S f(x)$ exists and equals $p$.


Now suppose $G$ contains no set $B \in S$, and let $H$ be the complement of $G$
(relative to the whole set $E$). Introduce the function

$$f(x) = \begin{cases} p & \text{if } x \in G, \\ q & \text{if } x \in H, \end{cases}$$

and let $\varepsilon = \frac{1}{3}\rho(p, q)$. If $\lim_S f(x) = t$ existed, then we
would have $\rho(t, f(x)) \le \varepsilon$ for all $x$ in some set $A \in S$. But both
sets $GA$ and $HA$ are nonempty by hypothesis, and hence, choosing first $x \in GA$ and
then $x \in HA$, we would get both $\rho(p, t) \le \varepsilon$ and
$\rho(q, t) \le \varepsilon$, which together imply
$\rho(p, q) \le \rho(p, t) + \rho(t, q) \le 2\varepsilon$, contrary to the definition
of $\varepsilon$. It follows by contradiction that $\lim_S f(x)$ fails to exist. ∎
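
The function built in the second half of the proof can be examined on the simplest direction. The sketch below is an illustration under stated assumptions, not part of the text: $E$ is the set of positive integers with the direction $n \to \infty$ of Sec. 4.13, $G$ is the set of even integers (which contains no tail $A_N$), $M$ is the real line, and $p = 0$, $q = 1$; the helper names are hypothetical. Only a finite stretch of each tail is inspected.

```python
def f(n, p=0.0, q=1.0):
    """f = p on G (even n) and f = q on the complement H (odd n)."""
    return p if n % 2 == 0 else q

def spread(values):
    """Difference between the largest and smallest value observed."""
    values = list(values)
    return max(values) - min(values)

def tail(N, width=1000):
    """A finite stretch of the set A_N = {N, N + 1, ...}."""
    return range(N, N + width)

starts = (1, 10, 10 ** 6)
full = [spread(f(n) for n in tail(N)) for N in starts]                     # on A_N
partial = [spread(f(n) for n in tail(N) if n % 2 == 0) for N in starts]    # on G A_N
print(full, partial)
```

On every tail the values of $f$ differ by $\rho(p, q) = 1$, which exceeds $2\varepsilon = 2/3$, so no point $t$ can serve as the full limit; on the sets $GA_N$ the function is constant, so the partial limit exists and equals $p$.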


b. Suppose $\lim_{GS} f(x)$ exists. If the function $f(x)$ is defined only on the set
$G$, then $\lim_S f(x)$ is meaningless as it stands. In the case where $G$ contains
some set $B \in S$, we let $\bar{f}(x)$ denote any extension of the function $f(x)$
from $G$ to $E$,† and then set

$$\lim_S f(x) = \lim_S \bar{f}(x) \tag{4}$$

by definition. Theorem 4.16a then shows that (4) makes sense and does not
depend on how $f(x)$ is extended from $G$ to $E$.
In particular, the limit of a sequence $y_1, y_2, \ldots, y_n, \ldots$ (Sec. 4.13)
makes sense not only when $y_n$ is defined for all $n = 1, 2, \ldots$, but also when
$y_n$ is defined only for all $n$ greater than some positive integer $n_0$. In the
latter case, we can assign $y_1, \ldots, y_{n_0}$ arbitrary values without changing
the value of

$$\lim_{n \to \infty} y_n.$$

Similarly, the definition of

$$\lim_{x \to \infty} f(x)$$

(Sec. 4.14a) depends only on the values of $f(x)$ for $x$ greater than some number
$x_0$, and is independent of the values of $f(x)$ for $x < x_0$. In just the same way,
the definition of

$$\lim_{x \to a} f(x)$$

(Sec. 4.15a) involves only the values of $f(x)$ “near the point $x = a$,” i.e., in some
deleted neighborhood $U'_\delta(a)$, no matter how small.


c. Returning to the case where the set $G$ intersects every $A \in S$ but contains no set
$A \in S$, let $H$ be the complement of $G$ (relative to $E$), as before. Consider the
system $HS$ of sets $HA$, where $A \in S$. Since no $HA$ is empty, the system $HS$ is
itself a direction, and we can talk about the existence or nonexistence of $\lim_{HS} f(x)$.
If $\lim_S f(x)$ exists and equals $p$, then $\lim_{GS} f(x)$ and $\lim_{HS} f(x)$ both exist and
equal $p$. However, the example given in the second part of the proof of Theorem
4.16a shows that the existence of both $\lim_{GS} f(x)$ and $\lim_{HS} f(x)$ does not imply
that of $\lim_S f(x)$.


THEOREM. If $\lim_{GS} f(x)$ and $\lim_{HS} f(x)$ both exist and are equal, to $p$ say, then
$\lim_S f(x)$ exists and equals $p$.


Proof. If

$$p = \lim_{GS} f(x) = \lim_{HS} f(x),$$

then, given any $\varepsilon > 0$, there are sets $A$ and $B$ in $S$ such that

$$\rho(p, f(x)) < \varepsilon \qquad (5)$$

for all $x \in GA$, $x \in HB$. One of the two sets $A$ and $B$ contains the other, say $B \supset A$.
Then (5) certainly holds for all $x \in GA$, $x \in HA$, and hence for all $x \in A = GA + HA$.
Since $\varepsilon$ is arbitrary, it follows that $p = \lim_S f(x)$. ∎


4.17. Behavior of limits under one-to-one mappings a. Suppose the set $E$ is
mapped in a one-to-one fashion onto a set $F$, with the element $x \in E$ going into
the element $y = \omega(x) \in F$. Given a direction $S$ on $E$, consisting of subsets $A \subset E$,
let $T$ be the "image" of $S$ under the mapping $\omega$, i.e., let $T$ be the system of all
subsets $\omega(A) = \{y \in F: y = \omega(x),\ x \in A\}$ of $F$ obtained as $A$ "varies over" the
system $S$. Then $T$ is a direction on $F$, since the defining properties of a direction
(the "nestedness" and "empty intersection" features) are obviously preserved
under one-to-one mappings. Finally, let $f(x)$ be a function defined on $E$ taking
values in a metric space $M$ equipped with a distance $\rho$, and let $g(y)$ be the
function defined on $F$ by the formula $g(y) = g(\omega(x)) = f(x)$.


THEOREM. The function $g(y)$ has a limit in the direction $T$ if and only if the function
$f(x)$ has a limit in the direction $S$. If both limits exist, then $\lim_S f(x) = \lim_T g(y)$.


Proof. Suppose

$$\lim_T g(y) = p \qquad (6)$$

exists. Then, given any $\varepsilon > 0$, there is a set $B \in T$ such that $\rho(p, g(y)) < \varepsilon$ for all
$y \in B$. But then

$$\rho(p, f(x)) = \rho(p, g(\omega(x))) < \varepsilon$$

for all $x$ in the corresponding set $A \in S$, which implies

$$\lim_S f(x) = p. \qquad (7)$$

Conversely, (7) implies (6) by the symmetry of the construction. ∎


b. It follows from Sec. 4.16b that the theorem remains valid if the one-to-one
mapping is defined on some subset $A \in S$ rather than on the whole set $E$.
We now give two examples illustrating these considerations. 


c. Let $E$ be the half-line $\{x: x \ge a\}$, and let $y = -x$ map $E$ onto the half-line
$F = \{y: y \le -a\}$. Choose the direction $x \to \infty$ on $E$. Then the corresponding direction
on $F$ is obviously $y \to -\infty$. It follows from Theorem 4.17a that the limits

$$\lim_{x \to \infty} f(x), \qquad \lim_{y \to -\infty} f(-y)$$

either both exist or both fail to exist, and that

$$\lim_{x \to \infty} f(x) = \lim_{y \to -\infty} f(-y) \qquad (8)$$

if they exist.


d. Let $E$ be the set $0 < |x - x_0| < 1$, and let $F$ be the set $|y| > 1$. Then the formula

$$y = \frac{1}{x - x_0}$$

establishes a one-to-one correspondence between $E$ and $F$. The direction $x \to x_0$
on $E$ corresponds to the direction $|y| \to \infty$ on $F$. It follows from Theorem 4.17a
that the limits

$$\lim_{|y| \to \infty} f(y), \qquad \lim_{x \to x_0} f\!\left(\frac{1}{x - x_0}\right)$$

either both exist or both fail to exist, and that

$$\lim_{|y| \to \infty} f(y) = \lim_{x \to x_0} f\!\left(\frac{1}{x - x_0}\right)$$

if they exist.
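The correspondence of Sec. 4.17d can be checked numerically. The sketch below is only an illustration, not part of the text: the test function `f`, the point `x0 = 2.0`, and the sample points are all assumptions chosen for the example. It evaluates `f` along $|y| \to \infty$ and along the matching points $x = x_0 + 1/y \to x_0$, where the two families of values should agree.

```python
def f(y: float) -> float:
    # Hypothetical test function; it tends to 1 as |y| -> infinity.
    return (y * y + 1.0) / (y * y + y + 2.0)


def g(x: float, x0: float) -> float:
    # The same function transported to 0 < |x - x0| < 1 via y = 1/(x - x0).
    return f(1.0 / (x - x0))


x0 = 2.0  # assumed base point

# Points with |y| -> infinity, taken with both signs.
ys = [s * 10.0 ** k for k in range(2, 7) for s in (1.0, -1.0)]

# Values along |y| -> infinity and along the matching x -> x0.
values_f = [f(y) for y in ys]
values_g = [g(x0 + 1.0 / y, x0) for y in ys]
```

Both lists approach the same number, as Theorem 4.17a predicts; the small discrepancies between them come only from floating-point rounding in `x0 + 1.0 / y`.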


4.18. The definition of a limit given in Sec. 4.12 clearly depends on the metric $\rho$
of the space $M$, a fact which can be indicated by writing $f(x) \xrightarrow{\rho} p$.
However, homeomorphic metrics (Sec. 3.34c) lead to the same limits, as shown
by the following

THEOREM. Given two homeomorphic metrics $\rho$ and $r$ defined on
the space $M$, $f(x) \xrightarrow{\rho} p$ if and only if $f(x) \xrightarrow{r} p$.


Proof. Suppose $f(x) \xrightarrow{\rho} p$. Then to prove that $f(x) \xrightarrow{r} p$ as well, we must show that,
given any $\varepsilon > 0$, there is a set $A$ in the underlying direction $S$ such that

$$r(p, f(x)) < \varepsilon \qquad (9)$$

for all $x \in A$. But, according to Theorem 3.34c, given any $\varepsilon > 0$ there is a $\delta > 0$
such that $\rho(p, y) < \delta$ implies $r(p, y) < \varepsilon$. Having found $\delta$, we choose $A \in S$ such
that

$$\rho(p, f(x)) < \delta$$

for all $x \in A$. But then (9) holds for all $x \in A$, as required. To prove that $f(x) \xrightarrow{r} p$
implies $f(x) \xrightarrow{\rho} p$, we need only reverse the roles of $\rho$ and $r$ in the above argument. ∎


4.19. As we now show, the criterion for the existence of the limit of a numerical 
sequence given in Theorem 3.72b can be carried over to the present more 
general context, provided we assume that the function f(x) takes its values in a 
complete space. 


THEOREM. Let $f(x)$ be a function defined on a set $E$ equipped with a direction $S$,
taking values in a complete metric space $M$ equipped with a distance $\rho$. Then $f(x)$
has a limit in the direction $S$ if and only if the following condition, called the
Cauchy convergence criterion, is satisfied: Given any $\varepsilon > 0$, there exists a set
$A \in S$ such that

$$\rho(f(x'), f(x'')) < \varepsilon \qquad (10)$$

for all $x', x'' \in A$.


Proof. Given that the Cauchy convergence criterion is satisfied, consider the
system of all subsets $f(B) = \{y \in M: y = f(x),\ x \in B\}$ of $M$ obtained as $B$ varies
over the system $S$. Since $S$ is a system of nested subsets, the same is obviously
true of the system of subsets $f(B) \subset M$ $(B \in S)$. Moreover, according to (10), there
are sets $f(B)$ of arbitrarily small diameter. Hence, by Lemma 3.74a, there is a
unique point $p \in M$ such that every neighborhood

$$U_\varepsilon(p) = \{y \in M: \rho(y, p) < \varepsilon\}$$

of $p$ contains some set $f(A)$ $(A \in S)$. But then $\rho(p, f(x)) < \varepsilon$
for all $x \in A$, i.e., $f(x)$ approaches $p$ in the direction $S$.
Conversely, suppose $f(x)$ approaches $p$ in the direction $S$. Then, given any $\varepsilon > 0$,
there is a set $A \in S$ such that $\rho(p, f(x)) < \varepsilon/2$
for all $x \in A$. But then

$$\rho(f(x'), f(x'')) \le \rho(p, f(x')) + \rho(p, f(x'')) < \varepsilon$$

for all $x', x'' \in A$. ∎


4.2. Some General Theorems 


As before, let E be any set equipped with a direction S, and let f(x) be a function 
defined on E, taking values in a metric space M equipped with a distance p. 


4.21. THEOREM. Suppose $f(x)$ is constant on $E$, i.e., suppose $f(x) = p$ for all $x \in E$.
Then $\lim_S f(x)$ exists and equals $p$.


Proof. Given any $\varepsilon > 0$, we obviously have $\rho(p, f(x)) = \rho(p, p) = 0 < \varepsilon$
for all $x$ in every set $A \in S$. ∎

4.22. THEOREM. The limit of $f(x)$ in the direction $S$ (if it exists) is unique.


Proof. Suppose both $\lim_S f(x) = p$ and $\lim_S f(x) = q$ $(p, q \in M)$. Then, given any
$\varepsilon > 0$, there is a set $A \in S$ such that

$$\rho(p, f(x)) < \varepsilon \qquad (1)$$

for all $x \in A$ and a set $B \in S$ such that

$$\rho(q, f(x)) < \varepsilon \qquad (2)$$

for all $x \in B$. Suppose $A \subset B$, say. Then both inequalities (1) and (2) hold for all
$x \in A$, and hence

$$\rho(p, q) \le \rho(p, f(x)) + \rho(q, f(x)) < 2\varepsilon$$

for all $x \in A$. Since $\varepsilon > 0$ is arbitrarily small, we have $\rho(p, q) = 0$ and hence $p = q$. ∎


4.23. We say that a function $f(x)$ belongs asymptotically to a set $G \subset M$ if there
exists a set $A \in S$ such that $f(x) \in G$ for all $x \in A$.


THEOREM. Suppose $\lim_S f(x) = p$, and let $G \subset M$ be a set containing an open ball
centered at $p$. Then $f(x)$ belongs asymptotically to $G$.


Proof. Suppose $G$ contains the ball
$U = \{y \in M: \rho(p, y) < \varepsilon\}$, and let $A \in S$ be such that $\rho(p, f(x)) < \varepsilon$
for all $x \in A$. Then $f(x) \in U \subset G$ for all $x \in A$. ∎


4.3. Limits of Numerical Functions 


4.31. a. In Secs. 4.3-4.6 below we will construct a theory of limits of numerical
functions, i.e., functions taking values on the real line (extended or not). The
special nature of such functions is determined by the presence of a particular
kind of metric on the real line, together with arithmetic operations and order
relations. As a matter of fact, we know two metrics on the real line, the ordinary
metric $\rho(x, y) = |x - y|$ defined on the set $R$ of all finite numbers, and the metric
$r(x, y)$ of Theorem 3.35d defined on the extended real line $\overline{R}$. But these two
metrics are homeomorphic on $R$ (see Theorem 3.35d), and hence, by Sec. 4.18,
the existence or nonexistence of a finite limit

$$p = \lim_S f(x) \qquad (1)$$

is independent of whether (1) is defined by using $\rho$ or by using $r$.


b. Given two functions $f(x)$ and $g(x)$ defined on a set $E$ with values in $\overline{R}$, we
write $f(x) \le g(x)$ if $f(x_0) \le g(x_0)$ for all $x_0 \in E$. A function $f(x)$ is said to be
bounded from above on $E$ if there exists a finite number $C$ such that $f(x) \le C$ for
all $x \in E$. A function $f(x)$ is said to be bounded from below on $E$ if there exists a
finite number $C$ such that $f(x) \ge C$ for all $x \in E$. A function $f(x)$ is said to be
bounded (from both sides) or bounded in absolute value on $E$ if there exists a
finite number $C$ such that $|f(x)| \le C$ for all $x \in E$.

At every point $x_0 \in E$ where $f(x)$ and $g(x)$ are finite, we define the sum $f(x) + g(x)$
as the sum of the corresponding values $f(x_0)$ and $g(x_0)$. The difference,
product, and quotient of $f(x)$ and $g(x)$ are defined similarly, with the proviso that
division is only possible at points $x_0$ where the denominator is nonvanishing.


4.32. Pursuing the considerations of Sec. 4.23, we now introduce a number of
further definitions together with some new notation ($\pm\infty$ as limits, the symbols
$O$ and $o$).

A numerical function $f(x)$ is said to be nonnegative (or positive) in the
direction $S$ if there exists a set $A \in S$ such that $f(x)$ is nonnegative (or positive)
for all $x \in A$. A numerical function $f(x)$ is said to be bounded (bounded from
above, bounded from below) in the direction $S$ if there exists a set $A \in S$ on which
$f(x)$ is bounded (bounded from above, bounded from below). If $f(x)$ is bounded
in the direction $S$, we write $f(x) = O(1)$.

Similarly, if $f(x)$ approaches zero in the direction $S$, i.e., if $\lim_S f(x) = 0$,
we write

$$f(x) = o(1).$$

Moreover, if given any $C \in R$, there exists a set $A \in S$ such that $f(x) > C$ for all
$x \in A$, we write

$$\lim_S f(x) = +\infty,$$

while if, given any $C \in R$, there exists a set $A \in S$ such that $f(x) < C$ for all $x \in A$, we write

$$\lim_S f(x) = -\infty.$$


4.33. LEMMA. The formula

$$\lim_S f(x) = p$$

is equivalent to

$$\lim_S\, [f(x) - p] = 0.$$


Proof. An immediate consequence of the definition of a limit and the nature of
the metric in $R$ or $\overline{R}$. ∎


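4.34. a. THEOREM. If $\lim_S f(x) = p$, then, given any $\varepsilon > 0$, there exists a set $A \in S$
such that, for all $x \in A$,

$$|f(x) - p| < \varepsilon \quad \text{if } p \text{ is finite},$$
$$f(x) > 1/\varepsilon \quad \text{if } p = +\infty,$$
$$f(x) < -1/\varepsilon \quad \text{if } p = -\infty.$$

Proof. An immediate consequence of Lemma 4.33 and the nature of the metric
in $R$ or $\overline{R}$. ∎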


b. THEOREM. If

$$\lim_S f(x) = p > 0 \quad \text{or} \quad \lim_S f(x) = +\infty,$$

then $f(x)$ is positive in the direction $S$, while if

$$\lim_S f(x) = p < 0 \quad \text{or} \quad \lim_S f(x) = -\infty,$$

then $f(x)$ is negative in the direction $S$.


Proof. The theorem follows at once from Theorem 4.34a if $p = \pm\infty$. If $p$ $(\ne 0)$ is
finite, we first find an $\varepsilon > 0$ such that the interval $(p - \varepsilon, p + \varepsilon)$ does not contain
the number 0 and hence consists entirely of numbers of the same sign as $p$ itself.
We then use Theorem 4.34a to find a set $A \in S$ on which $|f(x) - p| < \varepsilon$. ∎


c. The following proposition is a kind of converse of Theorem 4.34b:

THEOREM. If $f(x)$ is nonnegative in the direction $S$ and if $\lim_S f(x) = p$, then $p \ge 0$.


Proof. If we had $p < 0$, then by Theorem 4.34b, there would exist a set $A \in S$ on
which $f(x) < 0$. On the other hand, since $f(x) \ge 0$ in the direction $S$, there exists a
set $B \in S$ on which $f(x) \ge 0$. But this is impossible, since $AB$ is nonempty. Hence
$p \ge 0$, by contradiction. ∎


The theorem obviously remains true if we replace "nonnegative" by
"nonpositive" and $p \ge 0$ by $p \le 0$.


d. It should be emphasized that the strict inequality $> 0$ (or $< 0$, with equality
excluded) is preserved in arguing from the behavior of the limit, as in Theorem
4.34b, while only the weaker inequality $\ge 0$ (or $\le 0$, with equality permitted) is
preserved in inferring the behavior of the limit, as in Theorem 4.34c. If it is
known only that $p \ge 0$, we can draw no conclusions at all about the behavior of
$f(x)$ on the sets $A \in S$, and in fact $f(x)$ can take both positive and negative values
on these sets. Moreover, even if $f(x)$ is positive in the direction $S$, we can only
infer that $p \ge 0$ and not that $p > 0$.


e. We say that $f(x) \le g(x)$ in the direction $S$ if the difference $g(x) - f(x)$ is
nonnegative in the direction $S$. Similarly, we say that $f(x) < g(x)$ in the direction
$S$ if $g(x) - f(x)$ is positive in the direction $S$.


f. THEOREM. If $f(x) \le g(x)$ in the direction $S$ and if $\lim_S f(x) = p$, $\lim_S g(x) = q$,
then $p \le q$.

Proof. An immediate consequence of Theorem 4.34c. ∎

g. THEOREM. If $\lim_S f(x) = p$, $\lim_S g(x) = q$
and if $p < q$, then $f(x) < g(x)$ in the direction $S$.

Proof. An immediate consequence of Theorem 4.34b. ∎

h. THEOREM. If $f(x) \le g(x) \le h(x)$ in the direction $S$ and if $\lim_S f(x) = \lim_S h(x) = p$,
then

$$\lim_S g(x) = p.$$


Proof. We need only consider the case $p = 0$, otherwise changing $f(x)$, $g(x)$, and $h(x)$ to
$f(x) - p$, $g(x) - p$, and $h(x) - p$ (cf. Lemma 4.33). Given any $\varepsilon > 0$, there exists a set $A \in S$
on which $|f(x)| < \varepsilon$, $|h(x)| < \varepsilon$, and $f(x) \le g(x) \le h(x)$. But then $|g(x)| < \varepsilon$ on $A$. ∎
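Theorem 4.34h can be illustrated with a familiar case. The following sketch is an illustration under assumed choices that do not come from the text: the direction is $x \to \infty$, and the three functions are $-1/x \le \sin(x)/x \le 1/x$. It checks the two-sided bound at sample points and records how small the squeezed function has become on a tail.

```python
import math


def lower(x: float) -> float:
    # Assumed lower bound f(x) = -1/x, which tends to 0 as x -> infinity.
    return -1.0 / x


def middle(x: float) -> float:
    # Squeezed function g(x) = sin(x)/x.
    return math.sin(x) / x


def upper(x: float) -> float:
    # Assumed upper bound h(x) = 1/x, which tends to 0 as x -> infinity.
    return 1.0 / x


# Sample points moving off in the direction x -> infinity.
xs = [10.0 ** k for k in range(1, 8)]

# f <= g <= h holds at every sample point.
squeezed = all(lower(x) <= middle(x) <= upper(x) for x in xs)

# On the tail x >= 1e5 the squeezed function is at most 1e-5 in absolute value.
tail_bound = max(abs(middle(x)) for x in xs[-3:])
```

A finite sample cannot prove the limit, of course; it only shows the mechanism of the proof, where the set $A$ is a tail $x \ge \xi$ on which both bounds are smaller than $\varepsilon$.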


4.35. a. THEOREM. Suppose $f(x)$ and $g(x)$ are bounded in the direction $S$. Then so
are $f(x) + g(x)$ and $f(x)g(x)$.


Proof. By hypothesis, there are sets $A, B \in S$ such that $|f(x)| \le C_1$ for all $x \in A$ and
$|g(x)| \le C_2$ for all $x \in B$. If $A \subset B$, say, then

$$|f(x) + g(x)| \le C_1 + C_2, \qquad |f(x)g(x)| \le C_1 C_2$$

for all $x \in A$. ∎


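b. THEOREM. If $\lim_S f(x) = \lim_S g(x) = 0$,
then

$$\lim_S\, [f(x) + g(x)] = 0.$$


Proof. By hypothesis, given any $\varepsilon > 0$, there are sets $A, B \in S$ such that $|f(x)| < \varepsilon/2$ for all $x \in A$
and $|g(x)| < \varepsilon/2$ for all $x \in B$. If $A \subset B$, say, then $|f(x) + g(x)| < \varepsilon$
for all $x \in A$. ∎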


c. THEOREM. If $f(x)$ is bounded in the direction $S$ and if $\lim_S g(x) = 0$,
then

$$\lim_S f(x)g(x) = 0.$$


Proof. By hypothesis, there is a set $A \in S$ and a finite number $C > 0$ such that
$|f(x)| \le C$ for all $x \in A$. Then, given any $\varepsilon > 0$, we can find a set $B \in S$ such that
$|g(x)| < \varepsilon/C$ for all $x \in B$. It follows that $|f(x)g(x)| < \varepsilon$
for all $x \in AB$. ∎
4.36. a. THEOREM. If $\lim_S f(x) = p$, $\lim_S g(x) = q$,
then

$$\lim_S\, [f(x) + g(x)] = p + q.$$


Proof. The functions $f(x) - p$ and $g(x) - q$ both approach zero in the direction $S$,
and hence, by Theorem 4.35b, so does the function

$$f(x) + g(x) - (p + q) = [f(x) - p] + [g(x) - q].$$

Now use Lemma 4.33. ∎


b. THEOREM. If $\lim_S f(x) = p$, $\lim_S g(x) = q$,
then

$$\lim_S f(x)g(x) = pq.$$


Proof. The functions $f(g - q)$ and $q(f - p)$ both approach zero in the direction $S$,
by Theorem 4.35c (the boundedness of $f$ in the direction $S$ follows from Theorem
4.34a), and hence so does the function $fg - pq = f(g - q) + q(f - p)$. ∎


c. THEOREM. If $\lim_S f(x) = p \ne 0$,
then $\dfrac{1}{f(x)}$ is bounded in the direction $S$.


Proof. Let $A \in S$ be a set on which

$$|f(x)| > \frac{|p|}{2}$$

(see Theorem 4.34a). Then

$$\frac{1}{|f(x)|} < \frac{2}{|p|}$$

for all $x \in A$. ∎
d. THEOREM. If $\lim_S f(x) = p \ne 0$,
then

$$\lim_S \frac{1}{f(x)} = \frac{1}{p}.$$


Proof. Using the preceding theorem, Lemma 4.33, and Theorem 4.35c, we see
that

$$\frac{1}{f(x)} - \frac{1}{p} = \frac{1}{p\,f(x)}\,[p - f(x)]$$

approaches zero in the direction $S$. ∎


e. THEOREM. If $\lim_S f(x) = p \ne 0$, $\lim_S g(x) = q$,
then

$$\lim_S \frac{g(x)}{f(x)} = \frac{q}{p}.$$

Proof. An immediate consequence of Theorems 4.36b and 4.36d. ∎
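4.37. a. THEOREM. If $\lim_S f(x) = 0$,
then

$$\lim_S \frac{1}{|f(x)|} = \infty,$$

while if

$$\lim_S |f(x)| = \infty,$$

then

$$\lim_S \frac{1}{f(x)} = 0.$$

Proof. Note that $|f(x)| < \varepsilon$ implies $\dfrac{1}{|f(x)|} > \dfrac{1}{\varepsilon}$,
while $|f(x)| > C$ implies $\dfrac{1}{|f(x)|} < \dfrac{1}{C}$. ∎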

b. THEOREM. If

$$\lim_S f(x) = +\infty, \qquad (2)$$

then

$$\lim_S\, [-f(x)] = -\infty.$$

Moreover, suppose (2) holds and

$$\lim_S g(x) = p \ne 0.$$

Then

$$\lim_S f(x)g(x) = +\infty$$

if $p > 0$, while

$$\lim_S f(x)g(x) = -\infty$$

if $p < 0$.

Proof. An immediate consequence of the definitions of Sec. 4.32. ∎


4.38. a. Given two functions $f(x)$ and $g(x)$ defined on the same set $E$ equipped
with a direction $S$, we write

$$f(x) = o(g(x)) \qquad (3)$$

if $\lim_S \dfrac{f(x)}{g(x)} = 0$. The functions $f(x)$ and $g(x)$ are said to be equivalent (in the direction $S$) if

$$\lim_S \frac{f(x)}{g(x)} = 1.$$

b. THEOREM. If $f(x) = o(g(x))$, then

$$\lim_S \left| \frac{g(x)}{f(x)} \right| = \infty. \qquad (4)$$

Proof. An immediate consequence of Theorem 4.37a. ∎


c. THEOREM. If $f(x)$ and $g(x)$ are equivalent and if

$$\lim_S \frac{g(x)}{h(x)} = p,$$

where $h(x)$ is another function defined on $E$, then

$$\lim_S \frac{f(x)}{h(x)} = p.$$

Proof. It follows from Theorem 4.36b that

$$\lim_S \frac{f(x)}{h(x)} = \lim_S \frac{f(x)}{g(x)} \cdot \frac{g(x)}{h(x)} = \lim_S \frac{f(x)}{g(x)} \cdot \lim_S \frac{g(x)}{h(x)} = \lim_S \frac{g(x)}{h(x)} = p. \qquad ∎$$

Thus in calculating the limit of a quotient, we can replace the numerator (or,
for that matter, the denominator) by an equivalent function.


4.39. The symbol E 


a. By the symbol E(x), or briefly E, we mean any function with limit 1 in a given 
direction. In keeping with Sec. 4.23, E might be called an “asymptotic unit.” The 
product or quotient of two asymptotic units is clearly another asymptotic unit. 


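b. THEOREM. If $\lim_S u(x) = 0$,
then

$$[1 + u(x)]^p = 1 + pEu(x)$$

for every $p = 1, 2, \ldots$


Proof. By the binomial theorem,

$$[1 + u(x)]^p = 1 + pu(x) + \frac{p(p-1)}{2}u^2(x) + \cdots + u^p(x)$$
$$= 1 + pu(x)\left[1 + \frac{p-1}{2}u(x) + \cdots + \frac{1}{p}u^{p-1}(x)\right]$$
$$= 1 + pEu(x),$$

since the limit of the expression in brackets equals 1 (Theorems 4.36a and 4.36b). ∎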


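c. Asymptotic units are often useful in computing limits. For example, to
evaluate

$$\lim_{x \to 0} \frac{(1+x)^p - (1+x+x^2)^q}{(1+2x)^r - (1-2x+x^3)^s},$$

where $p$, $q$, $r$, and $s$ are positive integers, we can use the above theorem to write

$$(1+x)^p = 1 + pxE_1, \qquad x + x^2 = x(1+x) = xE_2,$$
$$(1+x+x^2)^q = (1 + xE_2)^q = 1 + qxE_2E_3 = 1 + qxE_4,$$
$$(1+2x)^r = 1 + 2xrE_5, \qquad 2x - x^3 = 2x\left(1 - \tfrac{1}{2}x^2\right) = 2xE_6,$$
$$(1-2x+x^3)^s = (1 - 2xE_6)^s = 1 - 2xsE_6E_7 = 1 - 2xsE_8.$$

It follows that

$$\frac{(1+x)^p - (1+x+x^2)^q}{(1+2x)^r - (1-2x+x^3)^s} = \frac{pxE_1 - qxE_4}{2xrE_5 + 2xsE_8} = \frac{pE_1 - qE_4}{2rE_5 + 2sE_8},$$

and hence finally

$$\lim_{x \to 0} \frac{(1+x)^p - (1+x+x^2)^q}{(1+2x)^r - (1-2x+x^3)^s} = \frac{p-q}{2(r+s)}.$$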
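The value $(p-q)/[2(r+s)]$ obtained in Sec. 4.39c can be sanity-checked with exact rational arithmetic. The sketch below is an illustration only: the exponents `p, q, r, s = 5, 2, 3, 4` and the sample point `x = 10**-6` are arbitrary assumptions, and the helper name `quotient` is hypothetical.

```python
from fractions import Fraction


def quotient(x, p, q, r, s):
    # The quotient of Sec. 4.39c, evaluated exactly for rational x.
    num = (1 + x) ** p - (1 + x + x**2) ** q
    den = (1 + 2 * x) ** r - (1 - 2 * x + x**3) ** s
    return num / den


# Assumed sample exponents (any positive integers would do).
p, q, r, s = 5, 2, 3, 4

# Limit predicted by the asymptotic-unit computation.
predicted = Fraction(p - q, 2 * (r + s))

# Evaluate close to x = 0; exact fractions avoid cancellation error.
x = Fraction(1, 10**6)
approx = quotient(x, p, q, r, s)
```

Using `Fraction` rather than floats is a deliberate choice here: numerator and denominator both tend to zero, so floating-point subtraction would lose most of its significant digits near $x = 0$.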
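d. Consider the rational function

$$R(x) = \frac{a_0x^n + a_1x^{n-1} + \cdots + a_n}{b_0x^m + b_1x^{m-1} + \cdots + b_m} \qquad (a_0 \ne 0,\ b_0 \ne 0).$$

Clearly

$$R(x) = \frac{a_0x^n\left(1 + \dfrac{a_1}{a_0x} + \cdots + \dfrac{a_n}{a_0x^n}\right)}{b_0x^m\left(1 + \dfrac{b_1}{b_0x} + \cdots + \dfrac{b_m}{b_0x^m}\right)} = \frac{a_0x^nE_1}{b_0x^mE_2} = \frac{a_0}{b_0}\,x^{n-m}E$$

as $|x| \to \infty$. It follows from the rules of Secs. 4.36 and 4.37 that

$$\lim_{|x| \to \infty} R(x) = \begin{cases} 0 & \text{if } n < m, \\[4pt] \dfrac{a_0}{b_0} & \text{if } n = m. \end{cases}$$

For $n > m$ the function $R(x)$ approaches an infinite limit whose sign depends on
the choice of direction ($x \to +\infty$ or $x \to -\infty$), the sign of $a_0/b_0$, and the evenness
or oddness of the number $n - m$. The reader should examine the various
possibilities as an exercise.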
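The case analysis of Sec. 4.39d can be written out as a small function. This is a sketch under stated assumptions rather than anything given in the text: the name `limit_at_infinity` is hypothetical, coefficients are assumed to be integers listed from the leading term down, and `direction` is `+1` for $x \to +\infty$ and `-1` for $x \to -\infty$. For $n > m$ the sign is read off from $(a_0/b_0)\,x^{n-m}$.

```python
from fractions import Fraction
import math


def limit_at_infinity(a, b, direction=+1):
    """Limit of R(x) = (a[0] x^n + ... + a[n]) / (b[0] x^m + ... + b[m]).

    a and b are integer coefficient lists with nonzero leading entries;
    direction is +1 for x -> +infinity and -1 for x -> -infinity.
    """
    n, m = len(a) - 1, len(b) - 1
    ratio = Fraction(a[0], b[0])
    if n < m:
        return Fraction(0)
    if n == m:
        return ratio
    # n > m: the limit is infinite, with the sign of (a0/b0) * x^(n-m).
    sign = 1 if ratio > 0 else -1
    if direction < 0 and (n - m) % 2 == 1:
        sign = -sign
    return math.inf if sign > 0 else -math.inf
```

For example, `limit_at_infinity([3, 0, 1], [2, 0, 5])` gives `Fraction(3, 2)`, matching the case $n = m$, while an odd difference $n - m$ makes the two directions disagree in sign.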


4.4. Upper and Lower Limits 


4.41. Let $f(x)$ be a numerical function defined on a set $E$ equipped with a
direction $S$, taking values in the extended real number system $\overline{R}$ (Sec. 1.9), and
let

$$a_A = \inf\,\{f(x): x \in A\}, \qquad b_A = \sup\,\{f(x): x \in A\}$$

for every $A \in S$. Then both $a_A$ and $b_A$ exist in $\overline{R}$ and $a_A \le b_A$. Since $S$ is a system
of nested subsets, the set of all intervals $[a_A, b_A]$ $(A \in S)$ is a system of nested
closed intervals. Let

$$\xi = \sup_{A \in S} a_A, \qquad \eta = \inf_{A \in S} b_A.$$

Then $\xi \le \eta$ and

$$[\xi, \eta] = \bigcap_{A \in S} [a_A, b_A], \qquad (1)$$

by the generalized principle of nested intervals (Sec. 1.94). The number $\xi \in \overline{R}$ is
called the lower limit of $f(x)$ in the direction $S$, denoted by

$$\xi = \underline{\lim}_S\, f(x),$$

while the number $\eta \in \overline{R}$ is called the upper limit of $f(x)$ in the direction $S$,
denoted by

$$\eta = \overline{\lim}_S\, f(x).$$


4.42. THEOREM. Given any $\varepsilon > 0$, there exists a set $A \in S$ such that

$$a_A = \inf\,\{f(x): x \in A\} > \xi - \varepsilon,$$
$$b_A = \sup\,\{f(x): x \in A\} < \eta + \varepsilon.$$


Proof. An immediate consequence of (1). ∎


4.43. THEOREM. If $\xi = \eta$, then $f(x)$ has a limit (possibly infinite) in the direction $S$,
equal to $\xi = \eta$.


Proof. An immediate consequence of the preceding theorem. ∎


4.44. THEOREM. If

$$\lim_S f(x) = p, \qquad (2)$$

then the interval (1) reduces to the single point $\xi = \eta = p$.


Proof. It follows from (2) and Theorem 4.19 that, given any $\varepsilon > 0$, there is a set
$A \in S$ such that $|f(x') - f(x'')| < \varepsilon$ for all $x', x'' \in A$. Hence there is a set $A \in S$ for
which $b_A - a_A$ can be made arbitrarily small. But then $\xi = \eta$ $(= p$, by Theorem
4.43), since otherwise $b_A - a_A$ could not be made smaller than $\eta - \xi$. ∎


4.45. To clarify the above concepts, we introduce another definition (cf. Sec.
3.41). A number $y \in \overline{R}$ is said to be a limit point of $f(x)$ in the direction $S$ if, given
any $\varepsilon > 0$ and any set $A \in S$, there exists a point $x \in A$ such that

$$|f(x) - y| < \varepsilon \quad \text{if } y \text{ is finite},$$
$$f(x) > 1/\varepsilon \quad \text{if } y = +\infty,$$
$$f(x) < -1/\varepsilon \quad \text{if } y = -\infty.$$

If $\lim_S f(x) = p$, then $p$ is a limit point of $f(x)$ in the direction $S$, but the converse is
in general not true.


4.46. a. THEOREM. Every limit point $y$ of $f(x)$ in the direction $S$ lies in the interval
$[\xi, \eta]$.


Proof. Clearly

$$a_A = \inf\,\{f(x): x \in A\} \le y \le \sup\,\{f(x): x \in A\} = b_A$$

for every $A \in S$, and hence

$$y \in \bigcap_{A \in S} [a_A, b_A] = [\xi, \eta]. \qquad ∎$$


b. THEOREM. The points $\xi$ and $\eta$ are themselves limit points of $f(x)$ in the direction
$S$.


Proof. If $\xi$ is finite, then, given any $\varepsilon > 0$, there is a set $A_0 \in S$ such that

$$0 \le \xi - a_{A_0} < \varepsilon/2$$

(see Theorem 4.42). Hence, for every $A \subset A_0$, we have $a_{A_0} \le a_A \le \xi$, which
implies

$$0 \le \xi - a_A < \varepsilon/2.$$

Moreover, since $a_A = \inf\,\{f(x): x \in A\}$, there exists a point $x \in A$ such that

$$0 \le f(x) - a_A < \varepsilon/2$$

and hence

$$|f(x) - \xi| < \varepsilon.$$

Every set $B \supset A_0$ also contains such a point $x$, which can be chosen from the set
$A_0$. Thus $\xi$ is a limit point of $f(x)$ in the direction $S$. The proof for the case where
$\xi \in \overline{R} - R$, as well as for the point $\eta$, is along the same lines (give the details). ∎


4.47. THEOREM. The set of all limit points of $f(x)$ in the direction $S$ is a nonempty
subset of $\overline{R}$, with greatest lower bound $\xi = \underline{\lim}_S\, f(x)$ and least upper bound
$\eta = \overline{\lim}_S\, f(x)$.


Proof. An immediate consequence of the preceding two theorems. ∎


4.48. Note in particular that $\xi$, the lower limit of $f(x)$, is just the smallest limit
point of $f(x)$, while $\eta$, the upper limit of $f(x)$, is just the largest limit point of $f(x)$.
Note also that $f(x)$ approaches the limit $p$ in the direction $S$ if and only if $p$ $(= \xi = \eta)$
is the unique limit point of $f(x)$ in the direction $S$.


4.5. Nondecreasing and Nonincreasing Functions 


4.51. a. A numerical sequence $y_1, y_2, \ldots, y_n, \ldots$ is said to be nondecreasing if

$$y_1 \le y_2 \le \cdots \le y_n \le y_{n+1} \le \cdots$$

We now introduce an analogous concept for a numerical function $y = f(x)$
defined on an arbitrary set $E$ equipped with a direction $S$.


b. Definition. A numerical function $f(x)$ is said to be nondecreasing in the
direction $S$ if $A \subset B$ (where $A, B \in S$) implies

$$\sup\,\{f(x): x \in B - A\} \le \inf\,\{f(x): x \in A\}. \qquad (1)$$


4.52. Examples 
The following examples show that the general definition of Sec. 4.51b is 
consistent with the usual meaning of the term “nondecreasing.” 


a. If $E = \{1, 2, \ldots\}$, then $y = f(x)$ is a numerical sequence $y_1 = f_1, y_2 = f_2, \ldots$ As in
Sec. 4.13, let $S$ be the direction on $E$ specified by the sets $A_n = \{n, n+1, n+2, \ldots\}$
$(n = 1, 2, \ldots)$.
Then the sequence $f_1, f_2, \ldots$ is nondecreasing in the sense of Sec. 4.51b if and
only if it is nondecreasing in the usual sense (as in Sec. 4.51a). In fact, suppose
$f_1, f_2, \ldots$ is nondecreasing in the sense of Sec. 4.51b, and choose

$$A = A_{n+1} = \{n+1, n+2, \ldots\} \in S, \qquad B = A_n = \{n, n+1, \ldots\} \in S.$$

Then $B - A = \{n\}$, and the condition (1) implies $f_n \le \inf\,\{f_{n+1}, f_{n+2}, \ldots\} \le f_{n+1}$
$(n = 1, 2, \ldots)$, so that $f_1, f_2, \ldots$ is nondecreasing in the usual sense. Conversely,
suppose $f_1, f_2, \ldots$ is nondecreasing in the usual sense, and let

$$A = A_n = \{n, n+1, \ldots\} \in S, \qquad B = A_k = \{k, k+1, \ldots\} \in S \quad (k < n).$$

Then

$$\sup\,\{f(x): x \in B - A\} = \sup\,\{f_k, \ldots, f_{n-1}\} = f_{n-1} \le f_n = \inf\,\{f(x): x \in A\},$$

so that $f_1, f_2, \ldots$ is nondecreasing in the sense of Sec. 4.51b.


b. Let $E = R_a^+$ be the real half-line $\{x: x \ge a\}$, and let $S$ be the system of all
subsets $A_\xi \subset R_a^+$ of the form $A_\xi = \{x: x \ge \xi\}$ $(\xi \ge a)$, as in Sec. 4.14a. Then $f(x)$ is
nondecreasing in the sense of Sec. 4.51b if and only if $f(x)$ is nondecreasing in
the usual sense, i.e., if and only if $a \le y < z$ implies $f(y) \le f(z)$. In fact, suppose
$f(x)$ is nondecreasing in the sense of Sec. 4.51b. Then if $A_z \subset A_y$ $(y \ne z)$, i.e., $y < z$,
we have

$$f(y) \le \sup\,\{f(x): y \le x < z\} \le \inf\,\{f(x): x \ge z\} \le f(z),$$

so that $f(x)$ is nondecreasing in the usual sense. Conversely, let $f(x)$ be nondecreasing in the
usual sense, and let $x' \in A_y - A_z$, $x'' \in A_z$. Then $y \le x' < z$, $x'' \ge z$, $f(x') \le f(x'')$, and
hence

$$\sup\,\{f(x'): x' \in A_y - A_z\} \le \inf\,\{f(x''): x'' \in A_z\},$$

by Theorem 1.62b, so that $f(x)$ is nondecreasing in the sense of Sec. 4.51b.


4.53. THEOREM. If the function $f(x)$ is nondecreasing and bounded from above in
the direction $S$, then $f(x)$ has a finite limit in the direction $S$. This limit is given by

$$p = \sup\,\{f(x): x \in A\}, \qquad (2)$$

where $A \in S$ is any set on which $f(x)$ is bounded from above.


Proof. Suppose $f(x)$ is bounded from above on the set $A \in S$, and let $p$ be the
quantity (2). Then, given any $\varepsilon > 0$, there is a point $x_0 \in A$ such that

$$p - \varepsilon < f(x_0) \le p.$$

Since the intersection of all the sets of the system $S$ is empty, there is a set $B \in S$
which does not contain $x_0$. Obviously, of the two possibilities $A \subset B$, $B \subset A$, we
have $B \subset A$ in the present case. Therefore

$$p - \varepsilon < f(x_0) \le \sup\,\{f(x): x \in A - B\} \le \inf\,\{f(x): x \in B\},$$

since $f(x)$ is nondecreasing in the direction $S$. On the other hand,
$B \subset A$ implies $\sup\,\{f(x): x \in B\} \le \sup\,\{f(x): x \in A\} = p$ (see Theorem 1.62a). It
follows that $p - \varepsilon < f(x) \le p$ for all $x \in B$, and hence $\lim_S f(x) = p$. ∎


If $f(x)$ is a nondecreasing function which approaches the limit $p$ in the
direction $S$, we write $f(x) \underset{S}{\nearrow} p$ or simply $f(x) \nearrow p$.


4.54. Definition. A numerical function $f(x)$ is said to be nonincreasing in the
direction $S$ if $A \subset B$ (where $A, B \in S$) implies

$$\inf\,\{f(x): x \in B - A\} \ge \sup\,\{f(x): x \in A\}.$$

For nonincreasing functions we have the following analogue of Theorem 4.53:

THEOREM. If the function $f(x)$ is nonincreasing and bounded from below in the
direction $S$, then $f(x)$ has a finite limit in the direction $S$. This limit is given by
$p = \inf\,\{f(x): x \in A\}$, where $A \in S$ is any set on which $f(x)$ is bounded from below.


Proof. The function $-f(x)$ is nondecreasing and bounded from above in the
direction $S$, and hence, by Theorem 4.53,

$$\lim_S\, [-f(x)] = \sup\,\{-f(x): x \in A\},$$

where $A \in S$ is any set on which $-f(x)$ is bounded from above. But then

$$\lim_S f(x) = -\sup\,\{-f(x): x \in A\} = \inf\,\{f(x): x \in A\},$$

by Theorem 1.61, where $A \in S$ is any set on which $f(x)$ is bounded from below. ∎


If $f(x)$ is a nonincreasing function which approaches the limit $p$ in the direction
$S$, we write $f(x) \underset{S}{\searrow} p$ or simply $f(x) \searrow p$.


4.55. THEOREM. If the function $f(x)$ is nondecreasing and unbounded from above in
the direction $S$, then

$$\lim_S f(x) = +\infty. \qquad (3)$$


Proof. Given any number $C$ and any set $B \in S$, there is a point $x_0 \in B$ such that
$f(x_0) > C$, by the unboundedness of $f(x)$. Holding $B$ fixed, we now find a set $A \subset B$
$(A \in S)$ which does not contain the point $x_0$. We then have

$$C < f(x_0) \le \sup\,\{f(x): x \in B - A\} \le \inf\,\{f(x): x \in A\},$$

since $f(x)$ is nondecreasing in the direction $S$. But then
$f(x) > C$ for all $x \in A$, which implies (3). ∎


4.56. For nonincreasing functions we have the following analogue of the above
theorem:

THEOREM. If the function $f(x)$ is nonincreasing and unbounded from
below in the direction $S$, then $\lim_S f(x) = -\infty$.


Proof. Apply Theorem 4.55 to the function $-f(x)$, as in the proof of Theorem 4.54. ∎


4.6. Limits of Numerical Sequences 


4.61. We now apply the general theory of Secs. 4.3-4.5 to the case of numerical
sequences. It will be recalled from Sec. 3.22a that a numerical sequence $x_n \in R$ is
said to converge to a limit $p \in R$ if, given any $\varepsilon > 0$, there exists an integer $N > 0$
such that $|p - x_n| < \varepsilon$ for all $n > N$, a fact expressed by writing $p = \lim_{n \to \infty} x_n$.

If the numbers $x_n$ are regarded as points of the extended real line $\overline{R}$, the
definition of a finite limit remains the same, but then the possibility of infinite
limits arises. A sequence $x_n$ converging in $\overline{R}$ to the point $+\infty$ is divergent in $R$. As
opposed to other kinds of divergent sequences, such a sequence is said to diverge
to $+\infty$. Similarly, a sequence $x_n$ converging in $\overline{R}$ to $-\infty$ is said to diverge to $-\infty$.


4.62. For numerical sequences the general Cauchy convergence criterion of
Theorem 4.19 reduces to Theorem 3.72b, i.e., a numerical sequence $x_n$ is
convergent if and only if given any $\varepsilon > 0$, there exists an integer $N > 0$ such that
$|x_m - x_n| < \varepsilon$ for all $m, n > N$.


4.63. a. In keeping with Sec. 4.32, a sequence $x_n$ is said to be bounded (bounded
from above, bounded from below) as $n \to \infty$ if there exist numbers $C$ and $N$ such
that $|x_n| \le C$ $(x_n \le C,\ x_n \ge C)$ for all $n > N$. Only a finite number of integers
$1, 2, \ldots, N$ fail to satisfy the condition $n > N$, and the set $\{x_1, x_2, \ldots, x_N\}$ is obviously
bounded. Hence in the definition of boundedness there is no need to invoke the
number $N$ or, for that matter, to write "as $n \to \infty$."


b. Moreover, in keeping with Secs. 4.51 and 4.54, a sequence $x_n$ is said to be
nondecreasing as $n \to \infty$ if there exists an $N$ such that $x_n \le x_{n+1}$ for all $n > N$.
Similarly, $x_n$ is said to be nonincreasing as $n \to \infty$ if there exists an $N$ such that
$x_n \ge x_{n+1}$ for all $n > N$.


c. As applied to sequences, Theorem 4.53 asserts that every nondecreasing
sequence which is bounded from above has a finite limit as $n \to \infty$. According to
Theorem 4.54, the same is true of every nonincreasing sequence which is
bounded from below. However, a nondecreasing sequence which is unbounded
from above diverges to $+\infty$, while a nonincreasing sequence which is unbounded
from below diverges to $-\infty$ (see Theorems 4.55 and 4.56).


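d. Example. Consider the sequence of positive numbers

$$u_n = \left(1 + \frac{1}{n}\right)^n \qquad (n = 1, 2, \ldots).$$

By the binomial theorem,

$$u_n = \left(1 + \frac{1}{n}\right)^n = 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{2!}\,\frac{1}{n^2} + \frac{n(n-1)(n-2)}{3!}\,\frac{1}{n^3} + \cdots$$
$$= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots, \qquad (1)$$

so that $u_n$ is a nondecreasing sequence, since both the number of terms and the
size of all the terms except the first two increase as $n \to \infty$. Moreover, replacing
$1 - (k/n)$ by 1, we get

$$u_n \le 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} \le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}} < 3, \qquad (2)$$

so that the sequence $u_n$ is bounded from above. Hence, by Sec. 4.63c, $u_n$ has a
finite limit as $n \to \infty$. Denoting this limit by $e$, we have

$$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n,$$

where $2 < e < 3$, because of (1) and (2). A more exact calculation gives $e = 2.71828\ldots$
It can be shown that the number $e$ is irrational and in fact
transcendental (Hermite's theorem).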
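The behavior of $u_n$ described in Sec. 4.63d is easy to observe numerically. The sketch below is illustrative only; the cutoff of 1000 terms is an arbitrary assumption, and floating-point arithmetic stands in for exact values. It checks that the computed terms are nondecreasing and stay between 2 and 3, and measures how far the last term still is from `math.e`.

```python
import math


def u(n: int) -> float:
    # n-th term of the sequence u_n = (1 + 1/n)^n.
    return (1.0 + 1.0 / n) ** n


# First 1000 terms (the cutoff is an arbitrary choice for illustration).
values = [u(n) for n in range(1, 1001)]

# Monotonicity and the bounds 2 <= u_n < 3 on the computed terms.
is_nondecreasing = all(a <= b for a, b in zip(values, values[1:]))
is_bounded = all(2.0 <= v < 3.0 for v in values)

# Remaining distance to the limit e after 1000 terms.
gap = math.e - values[-1]
```

The gap shrinks only like $e/(2n)$, so this sequence is a slow way to compute $e$; its interest lies in the existence argument, not in numerical efficiency.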


4.64. Next we specialize the considerations of Sec. 4.4 to the case of numerical
sequences. A point $y \in \overline{R}$ is said to be a limit point of the sequence $x_n$ (as $n \to \infty$)
if, given any $\varepsilon > 0$ and any $N > 0$, there exists an $n > N$ such that

$$|x_n - y| < \varepsilon \quad \text{if } y \text{ is finite},$$
$$x_n > 1/\varepsilon \quad \text{if } y = +\infty,$$
$$x_n < -1/\varepsilon \quad \text{if } y = -\infty$$

(cf. Sec. 4.45). Let

$$a_n = \inf\,\{x_n, x_{n+1}, \ldots\}, \qquad b_n = \sup\,\{x_n, x_{n+1}, \ldots\}$$

for every $n = 1, 2, \ldots$ Then the set of all intervals $[a_n, b_n]$ is a system of nested
closed intervals with intersection $[\xi, \eta]$, where

$$\xi = \sup\,\{a_1, a_2, \ldots\}, \qquad \eta = \inf\,\{b_1, b_2, \ldots\}$$

(cf. Sec. 4.41). Every limit point of the sequence $x_n$ lies in the interval $[\xi, \eta]$, and
the numbers $\xi, \eta \in \overline{R}$ are themselves limit points of $x_n$ (cf. Sec. 4.46). The number
$\xi$ is called the lower limit of $x_n$ (as $n \to \infty$), denoted by

$$\xi = \underline{\lim}_{n \to \infty}\, x_n,$$

while the number $\eta$ is called the upper limit of $x_n$, denoted by

$$\eta = \overline{\lim}_{n \to \infty}\, x_n.$$

Note that $\xi$ and $\eta$ are just the smallest and largest limit points of the sequence $x_n$,
respectively. Note also that $x_n \to p$ if and only if $p$ $(= \xi = \eta)$ is the unique limit
point of the sequence $x_n$.
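The construction of $\xi$ and $\eta$ from the tail bounds $a_n$ and $b_n$ can be imitated on a finite stretch of a sequence. The sketch below is an illustration under assumptions not found in the text: the sample sequence $x_n = (-1)^n(1 + 1/n)$, the truncation length `N = 2000`, and the helper name `tail_bounds` are all hypothetical choices. In general a finite truncation only approximates the true infimum and supremum of an infinite tail.

```python
def tail_bounds(xs, n):
    """Return (a_n, b_n): min and max of the stored terms x_n, x_{n+1}, ...

    n is 1-indexed, matching the notation of the text.
    """
    tail = xs[n - 1:]
    return min(tail), max(tail)


N = 2000  # assumed truncation length

# Sample sequence with two limit points, -1 and +1.
xs = [(-1) ** n * (1.0 + 1.0 / n) for n in range(1, N + 1)]

# Tail bounds for n = 1, ..., N/2 (each tail keeps at least N/2 terms).
bounds = [tail_bounds(xs, n) for n in range(1, N // 2 + 1)]
a_vals = [a for a, _ in bounds]
b_vals = [b for _, b in bounds]
```

The values `a_vals` increase toward the lower limit $-1$ and the values `b_vals` decrease toward the upper limit $+1$, mirroring the nested intervals $[a_n, b_n]$ of the text. For this particular sequence the extreme values of each tail occur at its first one or two terms, so the truncation does not distort them.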


4.65. Limit of a sequence versus limit of a function. If a function $f(x)$ defined
for all $x \ge x_0$ has a limit as $x \to \infty$, equal to $p$ say, then, by Sec. 4.16a, the
sequence of numbers $y_n = f(n)$, where $n$ takes integral values greater than $x_0$,
approaches the same limit $p$ (as $n \to \infty$). The converse assertion is not true: The
existence of the limit as $x \to \infty$ of a function $f(x)$ defined for $x \ge x_0$ cannot be
inferred from the existence of the limit of the sequence $f(n)$. For example, the
function $f(x) = (x)$, where $(x)$ is the fractional part of $x$ (defined in Sec. 1.71), has
no limit as $x \to \infty$, although the sequence $f(n) = (n) = 0$ has the limit 0.
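The fractional-part counterexample can be seen directly by sampling along two different sequences that both diverge to $\infty$. In this sketch the second sequence, $x_n = n + \tfrac{1}{2}$, is an assumption introduced for illustration, and `frac` is a hypothetical helper name for the fractional part.

```python
import math


def frac(x: float) -> float:
    # Fractional part of x, i.e. x minus the greatest integer not exceeding x.
    return x - math.floor(x)


# Two sequences diverging to infinity: the integers and the half-integers.
integers = [float(n) for n in range(1, 50)]
half_shifted = [n + 0.5 for n in range(1, 50)]

# Values of f(x) = (x) along each sequence.
along_integers = [frac(x) for x in integers]
along_halves = [frac(x) for x in half_shifted]
```

Along the integers the values are identically 0, along the half-integers identically 0.5, so no single number can serve as the limit of $(x)$ as $x \to \infty$.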


Nevertheless we have the following

THEOREM. A function $f(x)$ defined for all $x \ge x_0$
has a limit as $x \to \infty$ if and only if every sequence $f(x_n)$ has a limit, where $x_n \ge x_0$
is any sequence diverging to $\infty$.


Proof. If $f(x)$ approaches a limit $p$ as $x \to \infty$, then by Sec. 4.16a again, $p$ is the
limit of every sequence $f(x_n)$, where $x_n \to \infty$. If $f(x)$ fails to approach a limit as
$x \to \infty$, then $f(x)$ does not satisfy the Cauchy convergence criterion of Theorem
4.19. Hence for some $\varepsilon > 0$ and any $n = 1, 2, \ldots$, there are points $x_n' > n$, $x_n'' > n$
such that

$$|f(x_n') - f(x_n'')| \ge \varepsilon. \qquad (3)$$

The sequence

$$x_1', x_1'', x_2', x_2'', \ldots, x_n', x_n'', \ldots$$

obviously diverges to $\infty$, while the sequence

$$f(x_1'), f(x_1''), f(x_2'), f(x_2''), \ldots, f(x_n'), f(x_n''), \ldots$$

of corresponding values of $f(x)$ has no limit, since, as shown by (3), it does not
satisfy the Cauchy convergence criterion. ∎

Of course, these considerations do not prevent us from using further properties
of $f(x)$ to infer the existence of $\lim_{x \to \infty} f(x)$ from that of $\lim_{n \to \infty} f(n)$ in special
cases.


4.7. Limits of Vector Functions 


4.71. We now consider "vector functions," namely, functions taking values in
the $n$-dimensional real space $R_n$ (Sec. 2.61). Since $R_n$ is a metric space, we can
introduce the notion of the limit in a given direction $S$, where the limit has the
properties indicated in Sec. 4.2. Moreover, as we shall see in a moment, a
number of properties of numerical functions carry over to the case of functions
taking values in $R_n$ $(n \ge 2)$, namely, those which involve certain arithmetic
operations but make no use of order relations.


4.72. a. Addition of vector functions. Let f(x) and g(x) be two functions defined
on a set E, taking values in Rₙ. Then by the sum f(x) + g(x) we mean the function
whose value at every point x₀ ∈ E equals f(x₀) + g(x₀). Clearly f(x) + g(x) is itself
a function on E taking values in Rₙ (see Sec. 2.62).


b. Multiplication of a vector function by a real function. Let f(x) be a function
on E taking values in Rₙ, while a(x) is a function on E taking real values. Then
by the product a(x)f(x) we mean the function whose value at every point x₀ ∈ E
equals a(x₀)f(x₀). Clearly a(x)f(x) is itself a function on E taking values in Rₙ
(see Sec. 2.63).


c. In the case n = 2, where the vector functions f(x) and g(x) can be regarded as
complex-valued, the product f(x)g(x) and the quotient f(x)/g(x) (g(x) ≠ 0)
are defined with the help of the usual rules for multiplying and dividing complex
numbers (see Sec. 2.71) as the functions whose values at every point x₀ ∈ E
equal f(x₀)g(x₀) and f(x₀)/g(x₀), respectively.
4.73. a. THEOREM. The formula

lim_S f(x) = p ∈ Rₙ

is equivalent to

lim_S [f(x) − p] = 0

or

lim_S |f(x) − p| = 0,

where |f(x) − p| is the norm of the vector f(x) − p.

Proof. An immediate consequence of the definition of a limit and the nature of
the metric in Rₙ (see Sec. 3.14a). ∎


b. A function f(x) with values in Rₙ is said to be bounded in the direction S if
there exists a finite number C and a set A ∈ S such that |f(x)| ≤ C for all x ∈ A (cf.
Sec. 4.32).


THEOREM. If f(x) and g(x) are bounded in the direction S, then so is the sum f(x) + 
g(x). 


Proof. A slight generalization of the proof of Theorem 4.35a. ∎

c. THEOREM. If

lim_S f(x) = p ∈ Rₙ, lim_S g(x) = q ∈ Rₙ,

then

lim_S [f(x) + g(x)] = p + q ∈ Rₙ.

Proof. A slight generalization of the proof of Theorem 4.36a. ∎

d. THEOREM. If

lim_S f(x) = p ∈ Rₙ, lim_S a(x) = c ∈ R,

then

lim_S a(x)f(x) = cp.

Proof. A slight generalization of the proof of Theorem 4.36b. ∎


e. For the case n = 2, where the vector functions f(x) and g(x) can be regarded as
complex-valued, so that products and quotients of functions are defined, we have
the following THEOREM. If

lim_S f(x) = p ∈ C, lim_S g(x) = q ∈ C,

then

lim_S f(x)g(x) = pq, lim_S f(x)/g(x) = p/q (q ≠ 0).

Proof. A slight generalization of the proofs of Theorems 4.36b and 4.36e. ∎


f. In the field C of all complex numbers x + iy we define the direction z → ∞ as
the system of all sets A_r ⊂ C of the form

A_r = {z ∈ C: |z| > r}

(verify that z → ∞ is a direction). Then, given any function f(z) defined for
|z| > r₀, we can talk about the limit lim_{z→∞} f(z).

For example, choosing

f(z) = 1/z (z ≠ 0),

we have

lim_{z→∞} f(z) = lim_{z→∞} 1/z = 0,

since, given any ε > 0, the inequality

|f(z)| = 1/|z| < ε (1)

holds on the set

A_{1/ε} = {z ∈ C: |z| > 1/ε}.

More generally, let

f(z) = a₀ + a₁/z + ... + aₙ/zⁿ

be any polynomial in 1/z with complex coefficients a₀, a₁, ..., aₙ. Then

lim_{z→∞} f(z) = lim_{z→∞} (a₀ + a₁/z + ... + aₙ/zⁿ) = a₀

by (1) together with Theorems 4.73c and 4.73e.
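As a numerical illustration of the limit of a polynomial in 1/z, the sketch below evaluates such a polynomial on circles |z| = r of growing radius. The coefficients and the helper names `poly_in_inverse` and `max_deviation` are hypothetical, and sampling finitely many points of a circle only suggests, and does not prove, the behavior on the whole set A_r.

```python
import cmath


def poly_in_inverse(coeffs, z):
    """Evaluate a0 + a1/z + ... + an/z**n by Horner's rule in w = 1/z."""
    w = 1 / z
    total = 0j
    for a in reversed(coeffs):
        total = total * w + a
    return total


# Hypothetical complex coefficients a0, a1, a2, a3 (not taken from the text).
coeffs = [2 - 1j, 3j, -4, 1 + 1j]


def max_deviation(r, samples=360):
    """Largest |f(z) - a0| over sample points of the circle |z| = r."""
    worst = 0.0
    for k in range(samples):
        z = cmath.rect(r, 2 * cmath.pi * k / samples)
        worst = max(worst, abs(poly_in_inverse(coeffs, z) - coeffs[0]))
    return worst


radii = [10.0 ** e for e in range(1, 6)]
deviations = [max_deviation(r) for r in radii]
for r, d in zip(radii, deviations):
    print(f"r = {r:>8.0f}   max |f(z) - a0| = {d:.3e}")
```

The deviations shrink roughly like |a₁|/r, in agreement with the estimate (1) applied to each term aₖ/zᵏ.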


4.74. THEOREM. A function f(x) with values in Rₙ has a limit in the direction S if
and only if the following condition, called the Cauchy convergence criterion, is
satisfied: Given any ε > 0, there exists a set A ∈ S such that |f(x′) − f(x″)| < ε
for all x′, x″ ∈ A.

Proof. Specialize Theorem 4.19, observing that Rₙ is complete (Theorem 3.72d). ∎
4.75. A function f(x) with values in Rₙ can be written in the form f(x) = (f₁(x), ...,
fₙ(x)), where the functions f₁(x), ..., fₙ(x) are numerical functions, being
components of the vector function f(x).

THEOREM. A function f(x) with values in Rₙ has a limit in the direction S if and
only if each component function f₁(x), ..., fₙ(x) has a limit in the direction S.

Proof. Use the fact that

max_{1≤i≤n} |fᵢ(x′) − fᵢ(x″)| ≤ √(Σᵢ [fᵢ(x′) − fᵢ(x″)]²) ≤ Σᵢ |fᵢ(x′) − fᵢ(x″)|

(see Theorem 3.14b), together with the Cauchy convergence criterion. ∎


Problems 


1. Prove that if the sequence xₙ ∈ R is convergent, so is the sequence |xₙ|. Is the
converse true?

2. Given arbitrary real sequences aₙ and bₙ, prove that

lim inf aₙ + lim inf bₙ ≤ lim inf (aₙ + bₙ) ≤ lim inf aₙ + lim sup bₙ,
lim sup aₙ + lim sup bₙ ≥ lim sup (aₙ + bₙ) ≥ lim sup aₙ + lim inf bₙ.

3. Prove that if a sequence aₙ converges to the limit p, then so does any of its
rearrangements a_{n₁}, a_{n₂}, ..., a_{n_k}, ....

Does the convergence of the sequence follow from the convergence of one of its
rearrangements?

4. Prove that if the sequence aₙ is convergent, then

lim sup_{n→∞} (aₙ + bₙ) = lim_{n→∞} aₙ + lim sup_{n→∞} bₙ.

5. Prove that if

lim sup_{n→∞} (aₙ + bₙ) = lim sup_{n→∞} aₙ + lim sup_{n→∞} bₙ

for a given sequence aₙ and an arbitrary sequence bₙ, then aₙ is convergent.
6. Given that x₁ = a, x₂ = b, x₃ = ½(a + b), ..., xₙ = ½(xₙ₋₁ + xₙ₋₂), ...,
find lim_{n→∞} xₙ.

7. Suppose a > 0, x₀ > 0 and

xₙ₊₁ = ½(xₙ + a/xₙ) (n = 0, 1, 2, ...).

Prove that

lim_{n→∞} xₙ = √a.

8. Suppose a > b > 0 and

x₁ = a, y₁ = b, ..., xₙ₊₁ = √(xₙyₙ), yₙ₊₁ = ½(xₙ + yₙ), ....

Prove that the sequences xₙ and yₙ have the same limit. (Gauss)

9. Given that max {p₁, ..., pₘ} = p₁ (p₁ > 0, ..., pₘ > 0), prove that

lim_{n→∞} (Σ_{k=1}^{m} pₖⁿ)^{1/n} = p₁.

10. A straight line y = kx + b is called an asymptote of a curve y = f(x) defined
for all sufficiently large x if

lim_{x→∞} [f(x) − (kx + b)] = 0.

Prove that the curve y = f(x) has an asymptote as x → ∞ if and only if both limits

k = lim_{x→∞} f(x)/x, b = lim_{x→∞} [f(x) − x lim_{x→∞} f(x)/x]

exist.
11. Let f(x) be a function defined on a metric space M equipped with a distance
ρ₀, taking values in a metric space P equipped with a distance ρ. Suppose we
define the limit of f(x) as x → a by using “full neighborhoods” U_δ(a) = {x ∈ M:
ρ₀(x, a) < δ} of the point a (which include the point a itself) rather than the
deleted neighborhoods U′_δ(a) of Sec. 4.15a. In other words, overlooking the fact
that the sets U_δ(a) (δ > 0) do not constitute a direction, suppose we say that f(x)
→ p as x → a in the new sense if, given any ε > 0, there exists a number δ > 0
such that ρ(p, f(x)) < ε for all x ∈ U_δ(a), i.e., for all x satisfying the inequality
ρ₀(x, a) < δ.

Prove that f(x) → p as x → a in this new sense if and only if f(a) = p and f(x)
→ p as x → a in the sense of Sec. 4.15a.†
12. Let y(x) be a function defined on a set X equipped with a direction S, taking
values in a metric space Y, and let z(y) be a function defined on Y, taking values
in a metric space Z. Then the “composite function” z(x) = z(y(x)) is defined on
the set X and takes its values in the space Z. Suppose the limits

p = lim_S y(x) ∈ Y, q = lim_{y→p} z(y) ∈ Z

both exist. Does lim_S z(x) necessarily exist? If so, does it equal q?


13. With the same notation as in the preceding problem, prove that (a) If there
exists a set A ∈ S on which y(x) does not take the value p, then z(x) has the limit
q; (b) If there exists a set A ∈ S on which y(x) is identically equal to p, then z(x)
has the limit z(p); (c) If y(x) takes both values equal to p and values unequal to p
on every set A ∈ S, then z(x) has a limit if and only if q = z(p),‡ in which case

lim_S z(x) = q.


† The uniqueness of the limit p is proved in Theorem 4.22.

‡ For brevity, we say “as n → ∞” instead of “in the direction n → ∞,” and similarly elsewhere.

† Obviously, one of every pair of sets U′_{δ₁}(a), U′_{δ₂}(a) must contain the other (U′_{δ₁}(a) ⊃ U′_{δ₂}(a) if δ₁ >
δ₂).

‡ The same is true of the example of Sec. 4.13 (give the details).

† Given a function f(x) defined on a set G and a function g(x) defined on a larger set E ⊃ G, suppose f(x) =
g(x) for all x ∈ G. Then g(x) is called an extension of f(x) from G to E, while f(x) is called the restriction of
g(x) from E to G.

† Note that the completeness of M plays no role in the second part of the proof.

† So that, given any ε > 0, there exists a set A ∈ S such that |f(x)| < ε for all x ∈ A.

‡ Or simply lim f(x) = ∞.

† For brevity, we often omit arguments of functions here and below.

† If g(x) ≡ 1, i.e., if g(x) is identically equal to 1, then (3) reduces to f(x) = o(1), as in Sec. 4.32. If lim_S g(x) =
0, we describe (3) by saying that “f(x) approaches zero faster than g(x)” (in the direction S).

‡ If lim_S |f(x)| = ∞, we describe (4) by saying that “g(x) approaches infinity faster than f(x)” (in the direction
S).

† If ξ ∈ R̄ − R, we set ξ − ε = ξ, while if η ∈ R̄ − R, we set η + ε = η.

† See e.g., G. M. Fichtenholz, The Definite Integral (translated by R. A. Silverman), Gordon and Breach, N.
Y. (1973), Sec. 12.

† For simplicity, we often omit n → ∞ in the expressions for lim inf and lim sup, as in Problem 2.

† In particular, f(a) must now be defined, unlike the situation in Sec. 4.15a.

‡ Thereby making p the limit of z(x) in the sense of Problem 11.


5 Continuous Functions 


5.1. Continuous Functions on a Metric Space 


5.11. Let f(x) be a function defined on a metric space M equipped with a distance
ρ₀, taking values in a metric space P equipped with a distance ρ. As in Sec.
4.15a, let x → a be the direction corresponding to the deleted neighborhoods

U′_δ(a) = {x ∈ M: 0 < ρ₀(x, a) < δ} (a ∈ M, δ > 0).


a. Definition. The function f(x) is said to be continuous at x = a, and the point a
is then called a continuity point of f(x), if

lim_{x→a} f(x) = f(a).

In other words, f(x) is said to be continuous at x = a if, given any ε > 0, there
exists a δ > 0 such that ρ₀(x, a) < δ implies

ρ(f(x), f(a)) < ε. (1)

Naturally, for a numerical function f(x) the inequality (1) takes the form

|f(x) − f(a)| < ε.

Note that every isolated point a ∈ M is automatically a continuity point of f(x).


b. By an obvious modification of Theorem 4.65, a function f(x) is continuous at
x = a if and only if f(xₙ) → f(a) ∈ P for every sequence xₙ → a ∈ M.

c. Definition. A point a ∈ M is called a discontinuity point of f(x) if it is not a
continuity point of f(x). A function continuous at every point of a set E ⊂ M is
said to be continuous on E.

d. Naturally, the definition of continuity depends on the metrics of the spaces M
and P. But since the definition can be formulated in terms of convergent
sequences (Sec. 5.11b), the property of a function being continuous at a given
point or on a given set is preserved if the metrics in M and P are replaced by
homeomorphic metrics (Sec. 3.34c).


5.12. a. An obvious example of a continuous function with domain M and range
P is the constant function†

f(x) ≡ y₀,

where y₀ is a fixed point of the space P (cf. Theorem 4.21).

b. As another example, consider the distance ρ(a, x) from a fixed point a ∈ M to
a variable point x ∈ M. Clearly ρ(a, x) is a numerical function on the metric space
M. The fact that ρ(a, x) is continuous at every point x = x₀ of M follows from
Theorem 3.32b.
5.13. Continuous numerical functions

a. THEOREM. If the numerical functions f(x) and g(x) are continuous at x = x₀, then
so is their sum f(x) + g(x).

Proof. An immediate consequence of Theorem 4.36a. ∎

b. THEOREM. If the numerical functions f(x) and g(x) are continuous at x = x₀, then
so is their product f(x)g(x).

Proof. An immediate consequence of Theorem 4.36b. ∎

c. THEOREM. If the numerical functions f(x) and g(x) are continuous at x = x₀, then
so is their quotient f(x)/g(x), provided that g(x₀) ≠ 0.

Proof. An immediate consequence of Theorem 4.36e. ∎


d. We are now in a position to construct wide classes of continuous functions.
For example, the numerical function y = x defined on the real line R is obviously
continuous on R. Hence it follows from the above theorems that every
polynomial a₀xⁿ + a₁xⁿ⁻¹ + ... + aₙ is continuous on R, while every rational
function

(a₀xⁿ + a₁xⁿ⁻¹ + ... + aₙ) / (b₀xᵐ + b₁xᵐ⁻¹ + ... + bₘ)

is continuous at every point x ∈ R where its denominator is nonvanishing.

e. Let y = ξₖ, where ξₖ is the kth component of the vector x = (ξ₁, ..., ξₙ) ∈ Rₙ.
Then y is obviously a continuous numerical function on Rₙ. It follows from
Theorems 5.13a–5.13c that every polynomial in the components of the vector x
is continuous on Rₙ and that every rational function in the components of x is
continuous at every point of Rₙ where its denominator is nonvanishing.


5.14. Let f(x) be a continuous function on a metric space M equipped with a
distance ρ₀, taking values in a metric space P equipped with a distance ρ.
Moreover, let G be some subset of P, and let H = {x ∈ M: f(x) ∈ G}.

a. THEOREM. If G is open, then H is open.

Proof. Given x₀ ∈ H, f(x₀) ∈ G, let

V = {y ∈ P: ρ(y, f(x₀)) < ε}

be an open ball centered at f(x₀) and contained in G. Let δ > 0 be such that ρ₀(x,
x₀) < δ implies ρ(f(x), f(x₀)) < ε. Then the ball

U = {x ∈ M: ρ₀(x, x₀) < δ}

is contained in H. ∎

b. THEOREM. If G is closed, then H is closed.

Proof. The complement of the set H relative to M is the set {x ∈ M: f(x) ∈ P
− G}. Since P − G is open, this set is open by the preceding theorem, and hence
H is closed. ∎
c. THEOREM. If f(x) is a continuous numerical function and c ∈ R an arbitrary real
number, then the sets

{x ∈ M: f(x) < c}, {x ∈ M: f(x) > c}

are open, while the sets

{x ∈ M: f(x) ≤ c}, {x ∈ M: f(x) ≥ c}, {x ∈ M: f(x) = c}

are closed.

Proof. An immediate consequence of Theorems 5.14a and 5.14b. ∎

d. THEOREM. If f₁(x), ..., fₘ(x) are continuous numerical functions and a₁, ..., aₘ;
b₁, ..., bₘ (a₁ < b₁, ..., aₘ < bₘ) are arbitrary real numbers in R, then the set

{x ∈ M: a₁ < f₁(x) < b₁, ..., aₘ < fₘ(x) < bₘ} (2)

is open, while the set

{x ∈ M: a₁ ≤ f₁(x) ≤ b₁, ..., aₘ ≤ fₘ(x) ≤ bₘ} (3)

is closed.

Proof. An immediate consequence of the theorems on intersection of open and
closed sets (Theorems 3.22 and 3.54). ∎


e. Using the preceding theorem, we see that many figures of elementary
geometry, described by systems of inequalities like those in (2) and (3), are open or
closed sets. In n-dimensional space, for example, a set of the form

{x ∈ Rₙ: a₁ < x₁ < b₁, ..., aₙ < xₙ < bₙ}

is open, while a set of the form

{x ∈ Rₙ: a₁ ≤ x₁ ≤ b₁, ..., aₙ ≤ xₙ ≤ bₙ}

is closed. The first set is called an open block, while the second is called a closed
block.

Similarly, the polygonal figure in the x₁x₂-plane described by m “linear
inequalities” of the form

aⱼx₁ + bⱼx₂ < cⱼ (j = 1, ..., m)

is an open set, called an open m-gon, while the figure described by the same
inequalities with < replaced by ≤ is a closed set, called a closed m-gon. In the
same way, we can introduce open and closed polyhedra in n-dimensional space
(n > 2) by using inequalities involving linear functions of the coordinates of a
variable point x ∈ Rₙ. Closed figures of this type are compacta, provided they are
bounded (see Theorem 3.96e).
5.15. Continuity of a composite function

a. Given three metric spaces M, N, and P equipped with distances ρ_M, ρ_N, and ρ_P,
let y = f(x) be a function defined on M and taking values in N, while z = g(y) is a
function defined on N and taking values in P. Then the “composite function”

z = h(x) = g(f(x)) (4)

is defined on M and takes its values in P.

THEOREM. Suppose the function y = f(x) is continuous at x = x₀, while the function z
= g(y) is continuous at y = y₀ = f(x₀). Then the composite function (4) is
continuous at x = x₀.

Proof. Given any ε > 0, we can find a τ > 0 such that ρ_N(y, y₀) < τ implies
ρ_P(g(y), g(y₀)) < ε, since

lim_{y→y₀} g(y) = g(y₀).

Having found τ, we can then find a δ > 0 such that ρ_M(x, x₀) < δ implies

ρ_N(y, y₀) = ρ_N(f(x), f(x₀)) < τ,

since

lim_{x→x₀} f(x) = f(x₀).

But then ρ_M(x, x₀) < δ implies

ρ_P(h(x), h(x₀)) = ρ_P(g(f(x)), g(f(x₀))) = ρ_P(g(y), g(y₀)) < ε,

so that

lim_{x→x₀} h(x) = h(x₀). ∎

b. THEOREM. Let f(x) be a continuous function on a metric space M taking values
in another metric space N, and let b be a fixed point of N. Then

h(x) = ρ_N(f(x), b)

is a continuous numerical function on M.

Proof. An immediate consequence of Sec. 5.12b and the preceding theorem. ∎

In particular, if f(x) is a continuous numerical function on a metric space M, then

|f(x)| = ρ(f(x), 0)

is also continuous on M, since in this case N is just the real line R.


5.16. Continuous functions on compacta. Continuous functions on a compact 
metric space (Sec. 3.91a) have certain special properties which we consider here 
and in Sec. 5.17. 


a. THEOREM. The set of all values of a continuous function f(x) on a compact
metric space M is itself compact.

Proof. Given an arbitrary sequence f(x₁), ..., f(xₙ), ... of values of f(x), use the
compactness of M to choose a convergent subsequence x_{n₁}, ..., x_{n_k}, ... from the
sequence x₁, ..., xₙ, .... Suppose x_{n_k} → a, say. Then f(x_{n_k}) → f(a), since f(x) is
continuous at x = a. Thus we have found a convergent subsequence of the
original sequence f(x₁), ..., f(xₙ), ..., thereby proving the compactness of the set
{f(x): x ∈ M}. ∎

b. THEOREM (Weierstrass). Every continuous numerical function f(x) defined on a
compact metric space M (with values in R) is bounded. Moreover, f(x) achieves
its greatest lower and least upper bounds on M, i.e., if

α = inf_{x∈M} f(x), β = sup_{x∈M} f(x),

then there exist points p and q in M such that f(p) = α, f(q) = β.

Proof. An immediate consequence of the preceding theorem and the properties
of a compact set on the real line (see Theorem 3.96f). ∎


c. Let the space M in Weierstrass’ theorem be the closed interval [a, b], and
suppose f(a) = f(b). Then at least one of the points p and q must be an interior
point of [a, b], i.e., must lie in the open interval (a, b). In fact, if f(x) is a constant
equal to f(a), we can choose any point of the interval (a, b) as the point p = q.
Suppose, on the other hand, that f(x) is nonconstant, with values greater than
f(a), say. Then β = sup f(x) > f(a). By Weierstrass’ theorem, there exists a point
q ∈ [a, b] such that f(q) = β. But f(a) = f(b) < β, and hence q ∈ (a, b), as claimed.
If f(x) takes values less than f(a), then a similar argument shows that the point p
such that f(p) = α = inf f(x) must belong to (a, b).


5.17. Uniform continuity 


a. Definition. A function f(x) defined on a metric space M equipped with a
distance ρ₀, taking values in a metric space P equipped with a distance ρ, is said
to be uniformly continuous on M if, given any ε > 0, there exists a δ > 0 such
that ρ₀(x′, x″) < δ implies ρ(f(x′), f(x″)) < ε for arbitrary points x′, x″ ∈ M.
Obviously, every uniformly continuous function on M is continuous at every
point of M. In general, however, continuity of f(x) on M does not imply uniform
continuity of f(x) on M. For example, the function y = x² is continuous on the
half-line 0 ≤ x < ∞ (Sec. 5.13d). However, y = x² is not uniformly continuous on
0 ≤ x < ∞, since

x′² − x″² = (x′ + x″)(x′ − x″),

so that the condition |x′ − x″| < δ certainly does not imply that |x′² − x″²| is
bounded by any constant.
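The failure of uniform continuity for y = x² can be seen numerically. The sketch below is only an illustration; the helper `gap`, the fixed value δ = 10⁻³, and the sample points 10ᵏ are assumptions chosen for the demonstration.

```python
def gap(x: float, delta: float) -> float:
    """|x'^2 - x''^2| for x' = x + delta/2 and x'' = x, so that |x' - x''| < delta."""
    x1, x2 = x + delta / 2, x
    return abs(x1 * x1 - x2 * x2)


delta = 1e-3  # the same delta is used at every point of the half-line
points = [10.0 ** k for k in range(0, 7)]
gaps = [gap(x, delta) for x in points]

for x, g in zip(points, gaps):
    print(f"x = {x:>9.0f}   |x'^2 - x''^2| = {g:.6f}")
```

The pairs of points stay closer together than δ, yet the differences of the squares grow roughly like xδ, so no single δ can serve every point of the half-line.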


b. The situation is different in the case where M is compact, as shown by the
following

THEOREM (Heine). If f(x) is continuous on M and if M is compact, then f(x) is
uniformly continuous on M.

Proof. Suppose to the contrary that f(x) is not uniformly continuous on M. Then,
for some ε > 0 and any δ > 0, there is a pair of points x′, x″ ∈ M such that ρ₀(x′,
x″) < δ but ρ(f(x′), f(x″)) ≥ ε. Thus, for this ε and any n = 1, 2, ..., there is a
pair of points x′ₙ, x″ₙ ∈ M such that ρ₀(x′ₙ, x″ₙ) < 1/n but

ρ(f(x′ₙ), f(x″ₙ)) ≥ ε (5)

(choose δ = 1/n). Since M is compact, the sequence x′ₙ contains a convergent
subsequence x′_{n_k} → x₀ ∈ M. Moreover, since ρ₀(x′_{n_k}, x″_{n_k}) < 1/n_k, we have

ρ₀(x″_{n_k}, x₀) ≤ ρ₀(x″_{n_k}, x′_{n_k}) + ρ₀(x′_{n_k}, x₀) → 0,

so that x″_{n_k} → x₀ as well. The function f(x) is continuous at x₀, and hence
there is a δ₀ > 0 such that ρ₀(x, x₀) < δ₀ implies

ρ(f(x), f(x₀)) < ε/2.

Therefore

ρ(f(x′_{n_k}), f(x₀)) < ε/2, ρ(f(x″_{n_k}), f(x₀)) < ε/2

for all sufficiently large k, since x′_{n_k} → x₀, x″_{n_k} → x₀. But then

ρ(f(x′_{n_k}), f(x″_{n_k})) ≤ ρ(f(x′_{n_k}), f(x₀)) + ρ(f(x″_{n_k}), f(x₀)) < ε

for all sufficiently large k, contrary to (5). This contradiction shows that f(x) is
uniformly continuous on M. ∎


c. Modulus of continuity. Again let f(x) be a function with domain M and range
P, where M and P are metric spaces equipped with distances ρ₀ and ρ,
respectively. Consider the function

ω_f(δ) = sup_{ρ₀(x′, x″) ≤ δ} ρ(f(x′), f(x″)),

whose argument is the positive number δ. Clearly, f(x) is uniformly continuous
on M if and only if

lim_{δ→0} ω_f(δ) = 0.

The function ω_f(δ) is called the modulus of continuity of f(x) on the space M.
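The modulus of continuity can be estimated on a finite grid. The following is a rough sketch under stated assumptions: the function name `modulus_on_grid`, the test function √x on [0, 1], and the grid spacing are all hypothetical choices, the supremum is taken over pairs with |x′ − x″| ≤ δ, and a grid maximum only bounds the true supremum from below.

```python
def modulus_on_grid(f, points, delta):
    """Largest |f(x') - f(x'')| over grid pairs with |x' - x''| <= delta."""
    pts = sorted(points)
    worst = 0.0
    for i, x1 in enumerate(pts):
        for x2 in pts[i + 1:]:
            if x2 - x1 > delta:
                break  # later points are even farther from x1
            worst = max(worst, abs(f(x1) - f(x2)))
    return worst


grid = [k / 1000 for k in range(1001)]  # sample points of the compact interval [0, 1]
deltas = [0.5, 0.1, 0.01, 0.001]
omega_sqrt = [modulus_on_grid(lambda x: x ** 0.5, grid, d) for d in deltas]

for d, w in zip(deltas, omega_sqrt):
    print(f"delta = {d:<6}  estimated modulus = {w:.6f}")
```

For this test function the estimates agree with √δ and tend to zero with δ, as one expects for a function continuous on a compact interval (Heine's theorem).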


d. For numerical functions y = f(x), i.e., for functions taking values on the real
line R equipped with the usual metric ρ(y′, y″) = |y′ − y″|, the function ω_f(δ) has
the following properties:

(1) ω_{f+g}(δ) ≤ ω_f(δ) + ω_g(δ);

(2) ω_{cf}(δ) = cω_f(δ) for constant c > 0;

(3) ω_{fg}(δ) ≤ sup |f(x)| · ω_g(δ) + ω_f(δ) · sup |g(x)|.

The first two properties are an immediate consequence of the definition of the
modulus of continuity, while the third property follows from the inequality

|f(x′)g(x′) − f(x″)g(x″)|
= |f(x′)g(x′) − f(x′)g(x″) + f(x′)g(x″) − f(x″)g(x″)|
≤ |f(x′)| |g(x′) − g(x″)| + |f(x′) − f(x″)| |g(x″)|
≤ sup |f(x)| · |g(x′) − g(x″)| + |f(x′) − f(x″)| · sup |g(x)|.


5.18. Continuous functions of two variables. In analysis, one often encounters
continuous functions of two (or more) variables. Let M₁ and M₂ be two metric
spaces, equipped with distances ρ₁ and ρ₂. Then a function f(x₁, x₂) with
arguments x₁ ∈ M₁, x₂ ∈ M₂ taking values in a metric space P equipped with a
distance ρ,† is said to be continuous in x₁ and x₂ (jointly) at x₁ = x₁⁰, x₂ = x₂⁰ if,
given any ε > 0, there exists δ > 0 such that

ρ₁(x₁, x₁⁰) < δ, ρ₂(x₂, x₂⁰) < δ (6)

implies ρ(f(x₁, x₂), f(x₁⁰, x₂⁰)) < ε. It
is easy to see that this definition of continuity is essentially nothing new, and in
fact agrees with the definition of continuity of the function f(x₁, x₂) regarded as a
function f(x) of a single argument x = (x₁, x₂), namely a variable point of the
direct product M = M₁ × M₂ of the metric spaces M₁ and M₂. To see this, we
recall from Sec. 3.16 that a metric ρ₀ can be introduced in the space M = M₁ ×
M₂ by setting

ρ₀(x, y) = max {ρ₁(x₁, y₁), ρ₂(x₂, y₂)}, (7)

where x = (x₁, x₂), y = (y₁, y₂). Thus continuity of the function f(x) at the point
x⁰ = (x₁⁰, x₂⁰) means that given any ε > 0, there exists a δ > 0 such that

ρ₀(x, x⁰) = max {ρ₁(x₁, x₁⁰), ρ₂(x₂, x₂⁰)} < δ (8)

implies ρ(f(x), f(x⁰)) < ε. But the inequality (8) is equivalent to the pair of
inequalities (6), and hence the two definitions of continuity are equivalent.

In Sec. 3.16 we indicated two other ways of defining a metric on the direct
product M = M₁ × M₂. But the alternative metrics are both homeomorphic to the
metric (7), as shown in Sec. 3.34d. Hence, by Sec. 5.11d, all three metrics lead
to precisely the same set of continuous functions on M = M₁ × M₂ (with values
in P).


5.2. Continuous Numerical Functions on the Real Line 


5.21. a. In Secs. 5.2–5.7 we will consider numerical functions f(x) with domain
E ⊂ R̄ and values in R̄, where R̄ is the extended real line equipped with the
metric r(x, y) of Theorem 3.35d. The metric r(x, y) is homeomorphic to the usual
metric ρ(x, y) = |x − y| on the ordinary real line R. Hence, in problems involving
the continuity of f(x) at a point x₀, we can replace r by ρ on X if x₀ is finite and
on Y if f(x₀) is finite.


b. To illustrate these remarks, consider the rational function

f(x) = (a₀xⁿ + a₁xⁿ⁻¹ + ... + aₙ) / (b₀xᵐ + b₁xᵐ⁻¹ + ... + bₘ) (a₀ ≠ 0, b₀ ≠ 0),

defined at every point of R where the denominator is nonvanishing. Suppose we
complete the definition of f(x) at the points ±∞ by setting

f(−∞) = lim_{x→−∞} f(x), f(+∞) = lim_{x→+∞} f(x)

(the existence of these limits in R̄ was proved in Sec. 4.39d). Then the resulting
function, with values in R̄, is continuous at the points ±∞ as well.


c. LEMMA. If a numerical function f(x) with values in R̄ is continuous at a point x =
a ∈ R̄, then, given any ε > 0, there exists an open ball U_δ centered at a, i.e., a set
of the form

{x: |x − a| < δ} if a is finite,
{x: x > 1/δ} if a = +∞,
{x: x < −1/δ} if a = −∞

(δ > 0) such that, for all x ∈ U_δ,

|f(x) − f(a)| < ε if f(a) is finite,
f(x) > 1/ε if f(a) = +∞,
f(x) < −1/ε if f(a) = −∞.

In particular, U_δ can be chosen in such a way that f(x) has the same sign as f(a)
at every point x ∈ U_δ (provided that f(a) ≠ 0).

Proof. An immediate consequence of Theorems 4.34a and 4.34b, the definition
of the metric in R̄ (in particular, see Sec. 3.35e), and the definition of continuity. ∎


5.22. THEOREM (Bolzano). Suppose a numerical function f(x) is continuous on a
closed interval [a, b] and takes values with opposite signs at the end points of [a,
b]. Then there exists a point c ∈ (a, b) at which f(c) = 0.

Proof. Let f(a) < 0, f(b) > 0, say. Then, by the above lemma, f(x) < 0 holds for
all x sufficiently near a, while f(x) > 0 holds for all x sufficiently near b. Hence
the point

c = sup {x ∈ [a, b]: f(x) < 0}

is distinct from both a and b. By the definition of the least upper bound, f(x′) ≥ 0
for x′ > c, while for every δ > 0 there is an x″ > c − δ such that f(x″) < 0, i.e.,
every neighborhood of c contains points x′ and x″ such that f(x′) ≥ 0, f(x″) < 0.
But this is impossible if f(c) ≠ 0, by the lemma again. It follows that f(c) = 0.† ∎


5.23. THEOREM (Intermediate value theorem). Suppose a numerical function f(x)
is continuous on a closed interval [a, b] and takes distinct values A = f(a), B =
f(b) at the end points of [a, b]. Then, given any number C between A and B, there
exists a point c ∈ (a, b) at which f(c) = C.

Proof. The function f(x) − C satisfies the conditions of Bolzano’s theorem, and
hence vanishes at some point c ∈ (a, b). ∎
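The reduction of the intermediate value theorem to Bolzano's theorem can be mirrored computationally. The sketch below uses bisection, which is a different technique from the supremum argument of the proof of Theorem 5.22; the function names `bisect_zero` and `solve_level`, the tolerance, and the test function x³ + x are assumptions made for illustration only.

```python
def bisect_zero(f, a, b, tol=1e-12):
    """Approximate a zero of a continuous f on [a, b] with f(a), f(b) of opposite signs."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa < 0) == (fb < 0):
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid
        if (fm < 0) == (fa < 0):
            a, fa = mid, fm  # the sign change lies in [mid, b]
        else:
            b = mid          # the sign change lies in [a, mid]
    return (a + b) / 2


def solve_level(f, a, b, C, tol=1e-12):
    """Find c with f(c) = C by applying the zero finder to f(x) - C."""
    return bisect_zero(lambda x: f(x) - C, a, b, tol)


# Hypothetical example: f(0) = 0 and f(2) = 10, so C = 5 lies between them.
c = solve_level(lambda x: x ** 3 + x, 0.0, 2.0, 5.0)
print(c, c ** 3 + c)
```

Each halving keeps an interval whose end points carry values of opposite sign, so Bolzano's theorem guarantees a zero of f(x) − C inside every interval produced.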


5.24. One-sided continuity. We now define two further directions on the set R
of all finite real numbers, besides the direction x → a of Sec. 4.15b, consisting of
all deleted neighborhoods

U_δ = {x: 0 < |x − a| < δ} (δ > 0).

The first of these directions, denoted by x ↗ a, consists of all intervals of the
form

U_δ^l = {x: 0 < a − x < δ} (δ > 0)

(l for “left”), while the second, denoted by x ↘ a, consists of all intervals of the
form

U_δ^r = {x: 0 < x − a < δ} (δ > 0)

(r for “right”).† If a function f(x) has a limit p in the direction x ↗ a, we write

p = lim_{x↗a} f(x) = f(a − 0),

while if f(x) has a limit p in the direction x ↘ a, we write

p = lim_{x↘a} f(x) = f(a + 0).

Note that these limits are meaningful even if f(x) fails to be defined at the point x
= a itself (cf. Sec. 4.15a).

Now let S be the direction x → a, and let

G = {x: x < a}, H = {x: x > a}.

Then, in the notation of Sec. 4.16, the direction x ↗ a is just GS, while the
direction x ↘ a is just HS. Thus if lim_{x→a} f(x) exists and equals p, then lim_{x↗a}
f(x) and lim_{x↘a} f(x) both exist and equal p, as in Sec. 4.16a. However, the
existence of the “one-sided limits” lim_{x↗a} f(x) and lim_{x↘a} f(x) does not imply
the existence of lim_{x→a} f(x) unless the two limits are equal, to p say, in which
case lim_{x→a} f(x) exists and equals p, by Theorem 4.16c.

Suppose now that f(x) is defined at the point x = a. Then, according to Sec.
5.11a, we say that f(x) is continuous at x = a if

lim_{x→a} f(x) = f(a).

In the same way, we say that f(x) is continuous from the left at x = a if

lim_{x↗a} f(x) = f(a)

and continuous from the right at x = a if

lim_{x↘a} f(x) = f(a).

Suppose f(x) is continuous both from the left and from the right at x = a. Then
clearly f(x) is continuous at x = a.
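A function can be continuous from one side only. As a small illustration (the integer-part function and the helper `one_sided_values` are assumptions chosen for this sketch, and sampling finitely many points only suggests the one-sided limits):

```python
import math


def one_sided_values(f, a, side, steps=8):
    """Sample f at the points a - 10**-k (side='left') or a + 10**-k (side='right')."""
    sign = -1.0 if side == "left" else 1.0
    return [f(a + sign * 10.0 ** (-k)) for k in range(1, steps + 1)]


a = 2.0
left = one_sided_values(math.floor, a, "left")    # samples approaching a from below
right = one_sided_values(math.floor, a, "right")  # samples approaching a from above

print(left, right, math.floor(a))
```

Here the samples from the left all equal 1, the samples from the right all equal 2, and the value at a = 2 is 2, which is consistent with the integer part being continuous from the right but not from the left at this point.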


5.3. Monotonic Functions

5.31. Definition. Let f(x) be a numerical function defined on a set E ⊂ R̄, taking
values in R̄. Then f(x) is said to be increasing on E if x < y (x, y ∈ E) implies f(x)
< f(y), nondecreasing on E if x < y (x, y ∈ E) implies f(x) ≤ f(y), decreasing on E
if x < y (x, y ∈ E) implies f(x) > f(y), and nonincreasing on E if x < y (x, y ∈ E)
implies f(x) ≥ f(y). In any of these four cases, f(x) is said to be monotonic on E.
Obviously, if an increasing (or decreasing) function takes a value C, then it
can take this value at only one point of E. On the other hand, a nonincreasing or
nondecreasing function can take the same value at several points of E.


5.32. Let f(x) be nondecreasing on a closed interval E = [a, b], and let x₀ ∈ E be a
point distinct from a. Then f(x) is nondecreasing in the direction x ↗ x₀, in the
sense of Sec. 4.51. Since f(x) is bounded from above, by f(x₀) say, it follows
from Theorem 4.53 that the limit

f(x₀ − 0) = lim_{x↗x₀} f(x)

exists and equals

f(x₀ − 0) = sup {f(x): x < x₀, x ∈ E}.

Similarly, if x₀ ∈ E is a point distinct from b, it follows from Theorem 4.54 that
the limit

f(x₀ + 0) = lim_{x↘x₀} f(x)

exists and equals

f(x₀ + 0) = inf {f(x): x > x₀, x ∈ E}.

At the points a and b, we set

f(a − 0) = f(a), f(b + 0) = f(b),

by definition.


5.33. Again let f(x) be nondecreasing on a closed interval E = [a, b]. Since f(x′) ≤
f(x₀) ≤ f(x″) if x′ ≤ x₀ ≤ x″, it follows from Theorem 1.62b that

f(x₀ − 0) ≤ f(x₀) ≤ f(x₀ + 0).

By Sec. 5.24, the function f(x) is continuous from the left at x = x₀ if f(x₀ − 0) =
f(x₀) and continuous from the right at x = x₀ if f(x₀ + 0) = f(x₀). The condition f(x₀
− 0) = f(x₀ + 0) (which implies f(x₀ − 0) = f(x₀) = f(x₀ + 0)) is necessary and
sufficient for the nondecreasing function f(x) to be continuous at x = x₀.

If f(x₀ − 0) < f(x₀ + 0), then x₀ is certainly a discontinuity point of the function
f(x). The values of f(x) cannot exceed f(x₀ − 0) to the left of x₀ and cannot be less
than f(x₀ + 0) to the right of x₀, while the value of f(x₀) must lie in the interval

I = [f(x₀ − 0), f(x₀ + 0)].

It follows that only one number in the interval I can be a value of f(x).


5.34. The above observation leads to the following sufficient condition for
continuity of a nondecreasing function:

THEOREM. If f(x) is nondecreasing on the interval [a, b] and takes every value in
the interval [f(a), f(b)], then f(x) is continuous on [a, b].

Proof. Suppose f(x₀ − 0) < f(x₀ + 0) at some point x₀ ∈ [a, b]. Then f(x) cannot
take every value in [f(a), f(b)]. Hence f(x₀ − 0) = f(x₀ + 0) for every x₀ ∈ [a, b],
i.e., f(x) is continuous on [a, b]. ∎

The condition that f(x) take every value in the interval [f(a), f(b)] is also
necessary, by the intermediate value theorem (Theorem 5.23).

As an exercise, the reader should state and prove the analogues of the results
of Secs. 5.32–5.34 for nonincreasing functions.


5.35. The inverse function 


a. Definition. Let f(x) be a function with domain E and range F = {y: y = f(x), x ∈
E}. Then we say that f(x) has an inverse (on E) if there exists a (single-valued)
function g(y) with domain F such that g(f(x)) = x for all x ∈ E. Clearly, f(x) has
an inverse on E if and only if f(x) is a one-to-one function on E. The function
g(y), called the inverse (function) of f(x), is then unique and satisfies the
condition f(g(y)) = y for all y ∈ F.


b. THEOREM. Let the numerical function f(x) be continuous and increasing on a
closed interval [a, b] ⊂ R̄, and let

A = f(a), B = f(b) (A, B ∈ R̄).

Then f(x) has an inverse g(y) which is continuous and increasing on the closed
interval [A, B].

Proof. Let g(A) = a, g(B) = b by definition. By the intermediate value theorem
(Sec. 5.23), given any C such that A < C < B, there is a point c ∈ (a, b) such that
f(c) = C. Moreover c is unique (see Sec. 5.31). Let g(C) = c for all such C. The
function g(y) is now defined on the whole interval [f(a), f(b)], and is obviously
the inverse of f(x) since g(f(x)) = x by construction. Moreover, y₁ < y₂ implies x₁
< x₂, i.e., g(y₁) < g(y₂), so that g(y) is increasing. The continuity of g(y) follows
from Theorem 5.34, since g(y) takes all values in the interval [a, b]. ∎
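The construction of the inverse in this proof can be imitated numerically: for each C in [A, B], locate the unique c with f(c) = C. The following is only a sketch under stated assumptions; the name `inverse_on_interval`, the bisection search, the tolerance, and the test function x³ + x on [0, 2] are illustrative choices rather than anything taken from the text.

```python
def inverse_on_interval(f, a, b, tol=1e-12):
    """Return g with g(f(x)) ~ x for a continuous increasing f on the finite interval [a, b]."""
    A, B = f(a), f(b)

    def g(y):
        if not (A <= y <= B):
            raise ValueError("y must lie in [f(a), f(b)]")
        lo, hi = a, b
        # Invariant: f(lo) <= y <= f(hi); monotonicity makes the solution unique.
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) < y:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    return g


def f(x):
    return x ** 3 + x  # hypothetical continuous increasing function on [0, 2]


g = inverse_on_interval(f, 0.0, 2.0)
xs = [0.0, 0.25, 1.0, 1.75, 2.0]
round_trip = [g(f(x)) for x in xs]          # should reproduce xs up to the tolerance
inverse_values = [g(y) for y in (0.0, 1.0, 2.0, 5.0, 10.0)]
print(round_trip)
print(inverse_values)
```

The computed values of g increase with y and reproduce the original points, in line with the statement that the inverse is again increasing on [A, B].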


c. For the case of an open interval, this theorem takes the following form:

THEOREM. Let the numerical function f(x) be continuous and increasing on an open
interval (a, b), and let

A = lim_{x↘a} f(x), B = lim_{x↗b} f(x) (A, B ∈ R̄).

Then f(x) has an inverse g(y) which is continuous and increasing on the open
interval (A, B).

Proof. Extend the definition of f(x) to the closed interval [a, b] by setting f(a) =
A, f(b) = B. Then f(x) is continuous and increasing on [a, b]. It follows from the
preceding theorem that f(x) has an inverse g(y) which is continuous and
increasing on [A, B]. To complete the proof, we need only restrict f(x) to the
open interval (a, b), thereby restricting g(y) to (A, B). ∎

As an exercise, the reader should state and prove the analogues of Theorems
5.35b and 5.35c for decreasing functions.


d. The graph of an inverse function. Let y = f(x) be a one-to-one function on a
closed interval [a, b]. Then, as shown in Figure 2, the function g(y) has the graph
obtained from the graph of f(x) by reflecting the latter in the line y = x (the
bisector of the first quadrant). In fact, the reflection simply carries the point
(x, y = f(x)) into the point (y, x = g(y)).


Figure 2 


5.4. The Logarithm 


5.41. THEOREM. There exists a unique function f(x) (x > 0) satisfying the following
conditions:

(a) f(xy) = f(x) + f(y) for all positive x and y;

(b) f(a) = 1 for a given a > 1;

(c) f(x) is an increasing function, i.e., x < y implies f(x) < f(y).


Proof (after J. Dieudonné). First we construct f(x), assuming that f(x) actually
exists. It follows from Condition a that f(1) = 2f(1), and hence f(1) = 0. Then
Conditions a and b imply

$$f(a^n) = n f(a) = n \qquad (n = 1, 2, \ldots).$$

By Theorem 1.72, given any n = 1, 2, ..., there is an integer m such that

$$a^m \le x^n < a^{m+1}. \tag{1}$$

Since f(x) is increasing, (1) implies

$$f(a^m) = m \le f(x^n) = n f(x) \le f(a^{m+1}) = m + 1.$$

Therefore, given any n = 1, 2, ..., there is an integer m such that

$$\frac{m}{n} \le f(x) \le \frac{m+1}{n}.$$

In particular, choosing

$$n_0 = 1, \quad n_1 = 2, \quad \ldots, \quad n_k = 2^k, \quad \ldots,$$

we can find an integer $m_k$ such that

$$a^{m_k} \le x^{n_k} < a^{m_k+1} \qquad (k = 1, 2, \ldots), \tag{2}$$

thereby determining an interval

$$\left[\frac{m_k}{n_k}, \frac{m_k+1}{n_k}\right]. \tag{3}$$

The interval (3) contains the next interval

$$\left[\frac{m_{k+1}}{n_{k+1}}, \frac{m_{k+1}+1}{n_{k+1}}\right].$$

In fact, (2) implies

$$a^{2m_k} \le x^{2n_k} = x^{n_{k+1}} < a^{2(m_k+1)},$$

and hence $m_{k+1} \ge 2m_k$, $m_{k+1} + 1 \le 2(m_k + 1)$,

$$\frac{m_k}{n_k} = \frac{2m_k}{2n_k} \le \frac{m_{k+1}}{n_{k+1}} < \frac{m_{k+1}+1}{n_{k+1}} \le \frac{2(m_k+1)}{2n_k} = \frac{m_k+1}{n_k}.$$


Thus the set of all intervals (3) with k = 1, 2, ... is a system of nested closed
intervals. The intersection of all the intervals of this system is nonempty, by
Theorem 1.81, and in fact consists of a single point, by Theorem 1.82, since the
length of the kth interval approaches 0 as $k \to \infty$. We now choose the value of
f(x) to be the number corresponding to this unique point. Since the point is
unique, so is the resulting function f(x) (x > 0).
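
The nested intervals of this construction can be computed exactly with rational arithmetic. The sketch below is illustrative only: the function name `log_interval` and the use of Python's `fractions.Fraction` are assumptions of this example, and it takes $n_k = 2^k$ as in the proof. It returns the interval whose endpoints are $m_k/n_k$ and $(m_k+1)/n_k$.

```python
from fractions import Fraction


def log_interval(x, a, k):
    """Return (m/n, (m+1)/n) with n = 2**k and a**m <= x**n < a**(m+1).

    x and a should be integers or Fractions so that all powers are exact."""
    x, a = Fraction(x), Fraction(a)
    if not (x > 0 and a > 1):
        raise ValueError("need x > 0 and a > 1")
    n = 2 ** k
    target = x ** n
    m = 0
    power = Fraction(1)            # invariant: power == a**m
    while power > target:          # step down while a**m > x**n
        m -= 1
        power /= a
    while power * a <= target:     # step up while a**(m+1) <= x**n
        m += 1
        power *= a
    return Fraction(m, n), Fraction(m + 1, n)
```

For instance, `log_interval(10, 2, 10)` brackets the logarithm of 10 to the base 2 in an interval of length 1/1024, and increasing `k` by one halves the length, as in the proof.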

Next we show that the function f(x) just constructed satisfies Conditions a-c.
Given any positive numbers x and y, let $m_k$ and $p_k$ be integers such that

$$a^{m_k} \le x^{n_k} < a^{m_k+1}, \qquad a^{p_k} \le y^{n_k} < a^{p_k+1},$$

so that

$$\frac{m_k}{n_k} \le f(x) \le \frac{m_k+1}{n_k}, \qquad \frac{p_k}{n_k} \le f(y) \le \frac{p_k+1}{n_k}. \tag{4}$$

Then

$$a^{m_k+p_k} \le (xy)^{n_k} < a^{m_k+p_k+2},$$

which implies

$$\frac{m_k+p_k}{n_k} \le f(xy) \le \frac{m_k+p_k+2}{n_k}.$$

On the other hand, it follows from (4) that

$$\frac{m_k+p_k}{n_k} \le f(x) + f(y) \le \frac{m_k+p_k+2}{n_k},$$

and hence

$$|f(x) + f(y) - f(xy)| \le \frac{2}{n_k}.$$

Since $n_k$ can be made arbitrarily large, this immediately implies Condition a, by
Theorem 1.57. The fact that f(a) = 1 is obvious from the construction of f(x).
Moreover, given any x > 1, there is an integer n such that $x^n > a$, and hence, by
the definition of f(x), such that $f(x) \ge 1/n$. In particular, f(x) > 0 for any
x > 1. But then f(xy) = f(x) + f(y) > f(x) for every y > 1, so that f(x) is
increasing.


5.42. The function f(x) figuring in Theorem 5.41 is called the logarithm of x to
the base a and is denoted by $\log_a x$. In this notation, Conditions a-c of
Theorem 5.41 take the form

$$\log_a (xy) = \log_a x + \log_a y, \tag{5}$$

$$\log_a a = 1, \tag{6}$$

$$\log_a x < \log_a y \quad \text{if } x < y. \tag{7}$$

In particular, (5) implies

$$\log_a x^n = n \log_a x \qquad (n = 1, 2, \ldots). \tag{8}$$

Choosing x = 1, n = 2 in (8), we get $\log_a 1 = 2 \log_a 1$ and hence

$$\log_a 1 = 0.$$

Moreover, since

$$\log_a x + \log_a \frac{1}{x} = \log_a \left(x \cdot \frac{1}{x}\right) = \log_a 1 = 0,$$

we have

$$\log_a \frac{1}{x} = -\log_a x.$$

It follows that

$$\log_a \frac{x}{y} = \log_a x - \log_a y. \tag{10}$$

In particular, (8) holds for any integer n (not necessarily positive).
5.43. THEOREM. The function $\log_a x$ (x > 0) is continuous.

Proof. Given any $\varepsilon > 0$, choose $n > 1/\varepsilon$ and let $h = \sqrt[n]{a}$.
Since $h^n = a$, we have

$$n \log_a h = \log_a h^n = \log_a a = 1,$$

and therefore

$$\log_a h = \frac{1}{n}, \qquad \log_a \frac{1}{h} = -\frac{1}{n}.$$

Hence the inequality

$$-\varepsilon < -\frac{1}{n} \le \log_a x \le \frac{1}{n} < \varepsilon$$

holds in the interval $1/h \le x \le h$, thereby proving the continuity of $\log_a x$
at x = 1. Next, given any $y_0 > 0$, let $y = x y_0$, where $1/h \le x \le h$. Then, since

$$\log_a y = \log_a x + \log_a y_0,$$

we have

$$|\log_a y - \log_a y_0| = |\log_a x| < \varepsilon.$$

Since the interval $(y_0/h, h y_0)$ contains an interval of the form
$(y_0 - \delta, y_0 + \delta)$, it follows that $\log_a x$ is continuous at $x = y_0$.


5.44. We now establish the relation between logarithms to different bases a and
b. The function

$$f(x) = \frac{\log_a x}{\log_a b}$$

obviously satisfies Conditions a and c of Theorem 5.41, and moreover

$$f(b) = \frac{\log_a b}{\log_a b} = 1.$$

Hence, by the uniqueness of the logarithm, we have $f(x) = \log_b x$, thereby proving
the formula

$$\log_a x = \log_a b \cdot \log_b x \tag{11}$$

for arbitrary a > 1, b > 1, x > 0.
5.45. Finally we note that

$$\lim_{x \to \infty} \log_a x = \infty, \qquad \lim_{x \searrow 0} \log_a x = -\infty. \tag{12}$$

The first formula follows from Theorem 4.55 and the fact that $\log_a x$ is increasing
and unbounded from above (since, for example, $\log_a a^n = n$ for arbitrary n) as
$x \to \infty$. The second formula is an immediate consequence of the first, since

$$\lim_{x \searrow 0} \log_a x = \lim_{y \to \infty} \log_a \frac{1}{y} = -\lim_{y \to \infty} \log_a y = -\infty.$$

Using (12), we can complete the definition of $\log_a x$ by setting

$$\log_a 0 = -\infty, \qquad \log_a \infty = \infty.$$

Then the function $\log_a x$ (with values in $\overline{R}$) is continuous on the interval
$0 \le x \le \infty$.


5.5. The Exponential 


5.51. According to Theorem 5.35c, the continuous increasing function $\log_a x$ has
an inverse function, which we denote by $a^x$ and call the exponential (to the base
a).† The exponential is defined and positive for all real x (see Sec. 5.45), and
clearly satisfies the identities

$$a^{\log_a x} = x, \tag{1}$$

$$\log_a a^x = x. \tag{2}$$

Comparing the formula

$$\log_a (\underbrace{a \cdots a}_{n \text{ times}}) = \underbrace{\log_a a + \cdots + \log_a a}_{n \text{ times}} = n$$

with (2), we see that for n = 1, 2, ... the function $a^x$ reduces to

$$a^n = \underbrace{a \cdots a}_{n \text{ times}},$$

as previously defined in Sec. 1.46. Choosing x = 1/n in formula (2), we get

$$\log_a a^{1/n} = \frac{1}{n},$$

which implies

$$n \log_a a^{1/n} = \log_a (a^{1/n})^n = 1,$$

so that

$$(a^{1/n})^n = a,$$

i.e., $a^{1/n}$ is the nth root of a as defined in Theorem 1.63.


5.52. The exponential $y = a^x$ is increasing and continuous for all x, again by
Theorem 5.35c. Moreover

$$a^{x+y} = a^x a^y, \tag{3}$$

$$(a^x)^y = a^{xy} \tag{4}$$

for arbitrary real x and y. In fact, if $x = \log_a \xi$, $y = \log_a \eta$, then

$$a^{x+y} = a^{\log_a \xi + \log_a \eta} = a^{\log_a \xi\eta} = \xi\eta = a^x a^y,$$

while

$$(a^x)^y = (a^{\log_a \xi})^y = a^{y \log_a \xi} = a^{yx} = a^{xy}.$$

Replacing x by $b^x$ in formula (11), p. 148, we get

$$\log_a b^x = \log_a b \cdot \log_b b^x = x \log_a b. \tag{5}$$

The same formula (11) can be used to extend the definition of the logarithm and
exponential to the case where the base b lies in the interval (0, 1). In fact, let
b = 1/a where a > 1, and then set

$$\log_b x = \frac{\log_a x}{\log_a b} = -\log_a x = -\log_{1/b} x$$

by definition. The resulting function $\log_b x$ decreases as x increases and clearly
satisfies the conditions

$$\log_b (xy) = \log_b x + \log_b y,$$

$$\log_b b = 1.$$

Moreover,

$$b^x = \frac{1}{a^x} = a^{-x},$$

and hence

$$b^{\log_b x} = a^{-\log_b x} = a^{\log_a x} = x,$$

$$\log_b b^x = -\log_a a^{-x} = x,$$

so that formulas (1) and (2), as well as their implications (3) and (4), continue to
hold for $b \in (0, 1)$.


5.53. By interchanging the roles of a and x in the exponential $y = a^x$, we get the
closely related function

$$y = x^\alpha \qquad (x > 0)$$

("x raised to an arbitrary real power $\alpha$"). In terms of exponentials and
logarithms, we have the formula

$$x^\alpha = (b^{\log_b x})^\alpha = b^{\alpha \log_b x} \qquad (b > 1), \tag{6}$$

which, together with Theorem 5.15a, immediately implies the continuity of
$y = x^\alpha$. Replacing x by pq (p, q > 0) in (6), we get

$$(pq)^\alpha = (b^{\log_b pq})^\alpha = (b^{\log_b p + \log_b q})^\alpha = b^{\alpha \log_b p + \alpha \log_b q}
= (b^{\alpha \log_b p})(b^{\alpha \log_b q}) = (b^{\log_b p})^\alpha (b^{\log_b q})^\alpha,$$

and hence

$$(pq)^\alpha = p^\alpha q^\alpha. \tag{7}$$


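5.54. THEOREM. The functions $a^x$, $\log_a x$, and $x^\alpha$ have the following
"limiting properties":

$$\lim_{x \to \infty} a^x = \begin{cases} \infty & \text{if } a > 1, \\ 0 & \text{if } a < 1, \end{cases} \tag{8}$$

$$\lim_{x \to -\infty} a^x = \begin{cases} 0 & \text{if } a > 1, \\ \infty & \text{if } a < 1, \end{cases} \tag{9}$$

$$\lim_{x \to \infty} \log_a x = \begin{cases} \infty & \text{if } a > 1, \\ -\infty & \text{if } a < 1, \end{cases} \tag{10}$$

$$\lim_{x \searrow 0} \log_a x = \begin{cases} -\infty & \text{if } a > 1, \\ \infty & \text{if } a < 1, \end{cases} \tag{11}$$

$$\lim_{x \to \infty} x^\alpha = \begin{cases} \infty & \text{if } \alpha > 0, \\ 0 & \text{if } \alpha < 0, \end{cases} \tag{12}$$

$$\lim_{x \searrow 0} x^\alpha = \begin{cases} 0 & \text{if } \alpha > 0, \\ \infty & \text{if } \alpha < 0. \end{cases} \tag{13}$$

Proof. The first formula of the pair (8) follows from the fact that $a^x$ (a > 1) is
increasing and unbounded as $x \to \infty$ (the values of $a^x$ fill the domain of
definition of $\log_a x$, i.e., the whole half-line $0 < x < \infty$), and the second
formula is then obtained from the first by replacing a by 1/a. To get the formulas
(9) we need only replace x by -x in (8). The first of the formulas (10) was proved
in Sec. 5.45, and the second is obtained from the first by using the definition of
$\log_a x$ for a < 1 (Sec. 5.52). Replacing x by 1/x in (10) then gives (11). To get
(12) and (13), we use (8)-(11) and the fact that $x^\alpha = b^{\alpha \log_b x}$.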


Theorem 5.54 helps us draw the graphs of the functions $a^x$, $\log_a x$, and
$x^\alpha$, as shown in Figures 3-5.
Using (8) and (9), we can complete the definition of $a^x$ by setting

$$a^{-\infty} = 0, \qquad a^{\infty} = \infty$$

if a > 1. The function $a^x$ (with values in $\overline{R}$) is then continuous on
$\overline{R}$. The same is true if a < 1, provided we set

$$a^{-\infty} = \infty, \qquad a^{\infty} = 0.$$


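5.55. LEMMA. Let a > 1, $-\infty < r < \infty$. Then

$$\lim_{n \to \infty} \frac{a^n}{n^r} = \infty. \tag{14}$$

Proof. Let b be a number in the interval (1, a). Using (7), we have

$$\frac{a^{n+1}}{(n+1)^r} : \frac{a^n}{n^r} = a \left(\frac{n}{n+1}\right)^r. \tag{15}$$

By the continuity of $x^r$ at the point x = 1, the right-hand side of (15)
approaches a as $n \to \infty$, and hence exceeds b for all n starting with some integer
N. Therefore

$$\frac{a^{N+1}}{(N+1)^r} > b \frac{a^N}{N^r},$$

$$\frac{a^{N+2}}{(N+2)^r} > b \frac{a^{N+1}}{(N+1)^r} > b^2 \frac{a^N}{N^r},$$

$$\cdots$$

$$\frac{a^{N+k}}{(N+k)^r} > b^k \frac{a^N}{N^r}.$$

But

$$\lim_{k \to \infty} b^k = \infty,$$

by (8), which immediately implies (14).

Figure 4

Figure 5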


5.56. THEOREM. Let a > 1, p > 0, $-\infty < r < \infty$. Then

$$\lim_{x \to \infty} \frac{a^x}{x^r} = \infty, \tag{16}$$

$$\lim_{x \to \infty} \frac{\log_a x}{x^p} = 0, \tag{17}$$

$$\lim_{x \searrow 0} x^p \log_a x = 0, \tag{18}$$

$$\lim_{x \searrow 0} x^{x^p} = 1, \tag{19}$$

$$\lim_{x \to \infty} x^{1/x^p} = 1. \tag{20}$$

Proof. Given any x > 0, let n be such that $n \le x < n + 1$. Then

$$\frac{a^x}{x^r} > \frac{a^n}{(n+1)^r} = \frac{1}{a} \cdot \frac{a^{n+1}}{(n+1)^r},$$

which gives (16) after using the Lemma. To get (17), we change x to $\log_a x$ in (16)
and raise both sides to the power p = 1/r, assuming that r > 0. Replacing x by
1/x, we get (18). Finally (19) is obtained from (18) by taking exponentials, since
$x^{x^p} = a^{x^p \log_a x}$, while (20) is obtained from (19) by replacing x by 1/x.

According to (16) and (17), as $x \to \infty$ the exponential $a^x$ (a > 1) "grows more
rapidly" than any power of x while the logarithm $\log_a x$ (a > 1) "grows more
slowly" than any positive power of x.
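
A quick numerical experiment makes the two growth statements tangible. The Python sketch below is illustrative only: the function names and the sample parameters a = 1.1, r = 5 and a = 10, p = 0.1 are assumptions chosen for this example. The first quotient is evaluated through logarithms so that it does not overflow for large x.

```python
import math


def exp_over_power(x, a=1.1, r=5.0):
    """a**x / x**r, as in (16), computed as exp(x*ln(a) - r*ln(x))."""
    return math.exp(x * math.log(a) - r * math.log(x))


def log_over_power(x, a=10.0, p=0.1):
    """log_a(x) / x**p, as in (17)."""
    return math.log(x, a) / x ** p


for x in (10.0, 100.0, 500.0, 1000.0):
    print(f"x = {x:6.0f}   a^x / x^r = {exp_over_power(x):.3e}")

for x in (1e20, 1e100, 1e300):
    print(f"x = {x:.0e}   log_a(x) / x^p = {log_over_power(x):.3e}")
```

With these parameters the first quotient is still very small at x = 10 or x = 100 and becomes large only later, and the second quotient needs astronomically large x before it is small. The limits (16) and (17) say nothing about how soon the eventual behavior sets in.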


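5.57. If a function f(x) approaches a limit as $x \to \infty$, the sequence f(n)
(n = 1, 2, ...) approaches the same limit as $n \to \infty$ (see Sec. 4.65). This
observation leads to the following limit formulas involving sequences:

$$\lim_{n \to \infty} a^n = \begin{cases} \infty & \text{if } a > 1, \\ 0 & \text{if } a < 1, \end{cases} \tag{21}$$

$$\lim_{n \to \infty} \log_a n = \begin{cases} \infty & \text{if } a > 1, \\ -\infty & \text{if } a < 1, \end{cases} \tag{22}$$

$$\lim_{n \to \infty} n^\alpha = \begin{cases} \infty & \text{if } \alpha > 0, \\ 0 & \text{if } \alpha < 0, \end{cases} \tag{23}$$

$$\lim_{n \to \infty} \frac{a^n}{n^r} = \infty \qquad (a > 1,\ -\infty < r < \infty), \tag{24}$$

$$\lim_{n \to \infty} \frac{\log_a n}{n^p} = 0 \qquad (a > 1,\ p > 0), \tag{25}$$

$$\lim_{n \to \infty} \sqrt[n]{n} = 1. \tag{26}$$

Formulas (21)-(25) follow from (8), (10), (12), (16), and (17), while (26)
follows from (20) with p = 1.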


5.58. The exponential and the number e 


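a. First we recall from Sec. 4.63d that

$$\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e = 2.71828\ldots \tag{27}$$

THEOREM. The function

$$f(x) = \left(1 + \frac{1}{x}\right)^x \qquad (x \ne 0)$$

approaches e as $x \to \infty$.

Proof. This is not an immediate consequence of (27), but can be proved by using
the special character of f(x).† Given any x > 1, let n = n(x) be the integer such
that $n \le x < n + 1$. Then $n \to \infty$ as $x \to \infty$, and

$$\left(1 + \frac{1}{x}\right)^x < \left(1 + \frac{1}{n}\right)^{n+1} = \left(1 + \frac{1}{n}\right)^n \left(1 + \frac{1}{n}\right),$$

$$\left(1 + \frac{1}{x}\right)^x > \left(1 + \frac{1}{n+1}\right)^n = \frac{\left(1 + \dfrac{1}{n+1}\right)^{n+1}}{1 + \dfrac{1}{n+1}}.$$

The right-hand sides of both inequalities approach the same limit e as $x \to \infty$.
Hence, given any $\varepsilon > 0$, there is an integer N such that both right-hand sides
differ from e by less than $\varepsilon$ for all n > N. But then f(x) differs from e by less
than $2\varepsilon$ for all $x \ge N + 1$, i.e.,

$$\lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x = e. \tag{28}$$

b. Next we show that

$$\lim_{x \to -\infty} \left(1 + \frac{1}{x}\right)^x = e. \tag{29}$$

In fact,

$$\left(1 - \frac{1}{y}\right)^{-y} = \left(\frac{y}{y-1}\right)^y = \left(1 + \frac{1}{y-1}\right)^{y-1} \left(1 + \frac{1}{y-1}\right) \to e \tag{30}$$

as $y \to \infty$, by (28). Replacing y by -x, we get (29).† In particular, (30) implies

$$\lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^n = \frac{1}{e},$$

a result which can easily be deduced directly from (27).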
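
As a numerical illustration of the behavior of $(1 + 1/x)^x$ for large positive and large negative x, one can tabulate a few values. This is only a floating-point sketch: the function name `f` follows the notation of the theorem, and the sample points are arbitrary choices of this example.

```python
import math


def f(x):
    """(1 + 1/x)**x, defined for x > 0 or x < -1."""
    return (1.0 + 1.0 / x) ** x


for x in (10.0, 1e3, 1e6, -10.0, -1e3, -1e6):
    print(f"x = {x:>10.0f}   f(x) = {f(x):.8f}   f(x) - e = {f(x) - math.e:+.2e}")
```

The printed differences suggest that f(x) lies below e for positive x and above e for negative x, with an error that shrinks roughly in proportion to 1/|x|.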


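c. THEOREM.

$$\lim_{z \to 0} \frac{\log_a (1 + z)}{z} = \log_a e. \tag{31}$$

Proof. Noting that

$$\frac{\log_a (1 + z)}{z} = \log_a (1 + z)^{1/z}$$

and replacing z by 1/x, we get

$$\lim_{z \to 0} \frac{\log_a (1 + z)}{z} = \lim_{x \to \pm\infty} \log_a \left(1 + \frac{1}{x}\right)^x = \log_a e,$$

where in the last step we use the continuity of the logarithm, together with (28)
and (29).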


Formula (31) becomes particularly simple if we choose the number e as the
base a. Logarithms to the base e are called natural logarithms and are
denoted by

$$\log_e x = \ln x.$$

For natural logarithms, (31) reduces to

$$\lim_{z \to 0} \frac{\ln (1 + z)}{z} = 1. \tag{32}$$

In terms of the asymptotic unit E (see Sec. 4.39), we can write (32) in the form

$$\ln (1 + z) = zE \qquad (E \to 1 \text{ as } z \to 0). \tag{33}$$
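d. THEOREM.

$$\lim_{h \to 0} \frac{a^h - 1}{h} = \frac{1}{\log_a e}. \tag{34}$$

Proof. Let

$$a^h - 1 = \frac{1}{x}.$$

Then

$$a^h = 1 + \frac{1}{x}, \qquad h = \log_a \left(1 + \frac{1}{x}\right),$$

$$\frac{a^h - 1}{h} = \frac{1}{x \log_a \left(1 + \dfrac{1}{x}\right)} = \frac{1}{\log_a \left(1 + \dfrac{1}{x}\right)^x}.$$

Now use the continuity of the logarithm, together with (28) and (29).

Choosing a = e in (34), we get

$$\lim_{h \to 0} \frac{e^h - 1}{h} = 1, \tag{35}$$

or equivalently

$$e^h = 1 + Eh \qquad (E \to 1 \text{ as } h \to 0) \tag{36}$$

in terms of the asymptotic unit E.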


e. Although the number e does not appear explicitly in the statement of the next 
theorem, it would be hard to prove the theorem without making tacit use of 
limits involving e. 


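THEOREM.

$$\lim_{z \to 0} \frac{(1 + z)^p - 1}{z} = p \qquad (-\infty < p < \infty). \tag{37}$$

Proof. Using (33) and (36), we get in turn

$$\ln (1 + z) = zE_1, \qquad 1 + z = e^{zE_1},$$

$$(1 + z)^p = e^{pzE_1} = 1 + pzE_1E_2 = 1 + pzE_3, \tag{38}$$

which is equivalent to (37).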


It is clear from (38) that Theorem 4.39, previously proved for p = 1, 2, ..., is 
valid for arbitrary real p. 
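
The three limits (31), (34), and (37) can be checked numerically by evaluating the difference quotients at a small argument. The following Python sketch is only an illustration: the function names and the sample values a = 10 and p = 2.5 are assumptions of this example, and floating-point cancellation limits how small the argument can usefully be taken.

```python
import math


def log_quotient(z, a):
    """Left-hand side of (31): log_a(1 + z) / z."""
    return math.log(1.0 + z, a) / z


def exp_quotient(h, a):
    """Left-hand side of (34): (a**h - 1) / h."""
    return (a ** h - 1.0) / h


def power_quotient(z, p):
    """Left-hand side of (37): ((1 + z)**p - 1) / z."""
    return ((1.0 + z) ** p - 1.0) / z


a, p = 10.0, 2.5
for z in (1e-2, 1e-4, 1e-6, -1e-6):
    print(f"z = {z:+.0e}  {log_quotient(z, a):.7f}  {exp_quotient(z, a):.7f}  {power_quotient(z, p):.7f}")
print("expected limits:", math.log(math.e, a), 1.0 / math.log(math.e, a), p)
```

The three columns should approach $\log_a e$, $1/\log_a e$, and p respectively as z tends to 0 from either side.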


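f. Choosing p = -1 in (38), we get

$$\frac{1}{1 + z} = 1 - zE. \tag{39}$$

More generally, we have the useful formula

$$\frac{1 + \alpha zE_1}{1 + \beta zE_2} = 1 + (\alpha - \beta)zE \qquad (\alpha \ne \beta), \tag{40}$$

where $E_1, E_2, E \to 1$ as $z \to 0$. In fact, by (39),

$$\frac{1 + \alpha zE_1}{1 + \beta zE_2} = (1 + \alpha zE_1)(1 - \beta zE_2E_3)
= 1 + \alpha zE_1 - \beta zE_2E_3 - \alpha\beta z^2E_1E_2E_3
= 1 + z(\alpha E_1 - \beta E_2E_3 - \alpha\beta zE_1E_2E_3).$$

But this implies (40), since the expression in parentheses on the right obviously
approaches $\alpha - \beta$ as $z \to 0$.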


5.6. Trigonometric Functions 


5.61. In elementary trigonometry the functions sin x and cos x are defined 
geometrically, and it is then shown that they satisfy the formulas 


$$\sin^2 x + \cos^2 x = 1, \tag{1}$$

$$\sin (x + y) = \sin x \cos y + \cos x \sin y, \tag{2}$$

$$\cos (x + y) = \cos x \cos y - \sin x \sin y, \tag{3}$$

$$0 < \sin x < x < \frac{\sin x}{\cos x}, \tag{4}$$

where (4) holds for all sufficiently small positive x, say for $0 < x < \varepsilon_0$. This


approach will not be used here, since we wish to introduce sin x and cos x purely 
analytically, starting from the properties of the real numbers, without recourse to 
extraneous geometrical considerations. Instead we define sin x and cos x as the 
functions satisfying (1)-(4) for all real x. Thus, strictly speaking, the 
considerations that follow are provisional, and should read typically “If there 
exist functions sin x and cos x satisfying (1)-(4), then...” However, this 
circumlocution is hardly necessary, since the existence and uniqueness of such 
functions sin x and cos x will in fact be proved at a later stage of the game (in 
Secs. 8.54 and 8.66—8.68). 


5.62. Setting x = y = 0 in (1)-(3), we get 


$$\sin^2 0 + \cos^2 0 = 1, \tag{5}$$

$$\sin 0 = 2 \sin 0 \cos 0, \tag{6}$$

$$\cos 0 = \cos^2 0 - \sin^2 0. \tag{7}$$

It follows from (6) that either sin 0 = 0 or cos 0 = 1/2. But (5) and (7) imply

$$1 + \cos 0 = 2 \cos^2 0, \tag{8}$$

from which it is apparent that cos 0 = 1/2 is impossible. Therefore sin 0 = 0, and
(5) then implies $\cos 0 = \pm 1$. Formula (8) shows that of these two values, only
cos 0 = 1 is possible. Thus, finally,

$$\sin 0 = 0, \qquad \cos 0 = 1. \tag{9}$$


5.63. Next we replace y by -x in (2) and (3), afterwards using (9). This gives

$$0 = \sin x \cos (-x) + \cos x \sin (-x),$$

$$1 = \cos x \cos (-x) - \sin x \sin (-x).$$

Solving this linear system for sin (-x) and cos (-x), we get

$$\sin (-x) = -\sin x, \qquad \cos (-x) = \cos x. \tag{10}$$

Suppose now that from (2) we subtract the same equation with y replaced by -y,
and then use (10). The result is

$$\sin (x + y) - \sin (x - y) = 2 \cos x \sin y.$$

Doing the same with (3), we get

$$\cos (x + y) - \cos (x - y) = -2 \sin x \sin y.$$

Replacing x + y by $\alpha$ and x - y by $\beta$, so that

$$x = \frac{\alpha + \beta}{2}, \qquad y = \frac{\alpha - \beta}{2},$$

we can write the last two formulas as

$$\sin \alpha - \sin \beta = 2 \cos \frac{\alpha + \beta}{2} \sin \frac{\alpha - \beta}{2}, \tag{11}$$

$$\cos \alpha - \cos \beta = -2 \sin \frac{\alpha + \beta}{2} \sin \frac{\alpha - \beta}{2}. \tag{12}$$

5.64. a. THEOREM. The functions sin x and cos x are continuous for all x.

Proof. First we note that (1) implies

$$|\sin x| \le 1, \qquad |\cos x| \le 1 \tag{13}$$

for all x. If x > y and if x - y is sufficiently small, then, by (4), (11), and (13),

$$|\sin x - \sin y| = 2 \left|\cos \frac{x + y}{2}\right| \sin \frac{x - y}{2} \le 2 \sin \frac{x - y}{2} < 2 \cdot \frac{x - y}{2} = x - y,$$

which proves the continuity of sin x (why?). Similarly, by (4), (12), and (13),

$$|\cos x - \cos y| = 2 \left|\sin \frac{x + y}{2}\right| \sin \frac{x - y}{2} \le 2 \sin \frac{x - y}{2} < x - y,$$

which proves the continuity of cos x.


b. THEOREM.

$$\lim_{x \to 0} \frac{\sin x}{x} = 1. \tag{14}$$

Proof. By (4),

$$\cos x < \frac{\sin x}{x} < 1$$

if $0 < x < \varepsilon_0$, where the left and right-hand sides both approach 1 as
$x \searrow 0$. Hence, by Theorem 4.34h,

$$\lim_{x \searrow 0} \frac{\sin x}{x} = 1.$$

But, because of (10),

$$\lim_{x \nearrow 0} \frac{\sin x}{x} = \lim_{(-x) \searrow 0} \frac{\sin (-x)}{-x} = 1.$$

Combining the last two formulas, we get (14).


5.65. a. Next we show that the function y = cos x vanishes at certain points x > 0.
Let

$$\beta = \inf_{x > 0} \cos x,$$

and suppose $\beta > 0$. Then, by (11),

$$\sin (m + 1)x - \sin mx \ge 2\beta \sin \frac{x}{2} > 0$$

for sufficiently small x and all m = 1, 2, ..., which implies that sin mx increases
without limit as m increases, contrary to (13). This contradiction shows that
$\beta = 0$. There must now be a point $x_0$ at which $\cos x_0 = 0$. In fact, if
cos x > 0 for all $x \ge 0$, then (11) implies that sin x is increasing and hence, by
(1), that cos x is decreasing. Since $\beta = 0$, it follows that there is a point $z_0$
such that $x > z_0$ implies

$$\cos x < \frac{5}{13}$$

and hence

$$\sin x = \sqrt{1 - \cos^2 x} > \frac{12}{13}.$$

But then, setting $y = x > z_0$ in (2), we get

$$\frac{12}{13} < \sin 2x = 2 \sin x \cos x < 2 \cdot 1 \cdot \frac{5}{13} = \frac{10}{13},$$

which is impossible. Therefore we cannot have cos x > 0 for all $x \ge 0$, i.e., cos x
vanishes at some point $x_0$.

b. Now let

$$\frac{\pi}{2} = \inf \{x > 0 : \cos x = 0\}. \tag{15}$$

The number $\pi/180$ is called a degree, so that $\pi/2$ = 90 degrees (written 90°). It
follows from (15) and the continuity of cos x that

$$\cos \frac{\pi}{2} = 0, \qquad \sin \frac{\pi}{2} = 1. \tag{16}$$

Moreover, sin x is increasing on the interval $[0, \pi/2]$, by (11), and hence cos x is
decreasing on $[0, \pi/2]$. Using (2), (3), and (16), we get the formulas

$$\sin \left(x + \frac{\pi}{2}\right) = \sin x \cos \frac{\pi}{2} + \cos x \sin \frac{\pi}{2} = \cos x, \tag{17}$$

$$\cos \left(x + \frac{\pi}{2}\right) = \cos x \cos \frac{\pi}{2} - \sin x \sin \frac{\pi}{2} = -\sin x, \tag{18}$$

which in turn imply

$$\sin (x + \pi) = \cos \left(x + \frac{\pi}{2}\right) = -\sin x, \tag{19}$$

$$\cos (x + \pi) = -\sin \left(x + \frac{\pi}{2}\right) = -\cos x. \tag{20}$$

c. Using (19) and (20), we see at once that

$$\sin (x + 2\pi) = -\sin (x + \pi) = \sin x, \tag{21}$$

$$\cos (x + 2\pi) = -\cos (x + \pi) = \cos x. \tag{22}$$

A function f(x) such that

$$f(x + T) = f(x)$$

for all x and some T is said to be periodic, with period T. Thus (21) and (22)
show that the functions sin x and cos x are periodic, with period $2\pi$. The
behavior of the signs of these functions in an interval of length $2\pi$ (one period)
can easily be deduced from (17)-(20), and is shown in the following table:

            0 < x < π/2    π/2 < x < π    π < x < 3π/2    3π/2 < x < 2π
    sin x        +              +               -                -
    cos x        +              -               -                +

Note that given any x such that sin x ≠ 0, cos x ≠ 0, the signs of sin x and cos x
uniquely determine which of the four subintervals (of length $\pi/2$) contains x. The
graphs of sin x and cos x are shown in Figure 6.† It turns out that $\pi$ = 3.14159...
(cf. Chapter 9, Problem 15).
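
Definition (15) can be imitated numerically: march to the right from 0 until the cosine stops being positive, then refine by bisection. The sketch below is an illustration under explicit assumptions, not the analytic route of the text. It relies on the library function `math.cos` (whose implementation already presupposes π), and the helper name `first_positive_zero`, the step size, and the tolerance are choices of this example.

```python
import math


def first_positive_zero(f, step=0.01, tol=1e-12):
    """Approximate the smallest x > 0 with f(x) = 0.

    Assumes f is continuous, f(0) > 0, and that `step` is small enough
    for no zero to be skipped while marching to the right."""
    lo = 0.0
    while f(lo + step) > 0:      # march right while f stays positive
        lo += step
    hi = lo + step               # now f(lo) > 0 >= f(hi)
    while hi - lo > tol:         # bisection on the bracketing interval
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


half_pi = first_positive_zero(math.cos)
print(2 * half_pi)   # compare with the value of pi quoted in the text
```

The same helper can be pointed at any other continuous function that is positive at 0, provided the step-size assumption holds.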


Figure 6 


5.66. LEMMA. If x and u are such that

$$\sin (x + u) = \sin u, \qquad \cos (x + u) = \cos u, \tag{23}$$

then $x = 2k\pi$, where k is an integer.

Proof. Solving the system

$$\sin (x + u) = \sin x \cos u + \cos x \sin u = \sin u,$$
$$\cos (x + u) = \cos x \cos u - \sin x \sin u = \cos u \tag{23'}$$

for sin x and cos x, we get

$$\sin x = 0, \qquad \cos x = 1. \tag{24}$$

By Sec. 5.65, the only solution of (24) in the interval $0 \le x < 2\pi$ is given by
x = 0. Hence, since sin x and cos x are periodic with period $2\pi$, the only solutions
of (24) on the whole real line are given by

$$x = 2k\pi \qquad (k = 0, \pm 1, \pm 2, \ldots).$$


5.67. As already noted, the function sin x is increasing on the interval $[0, \pi/2]$.
Hence, because of the identity

$$\sin (-x) = -\sin x,$$

sin x is also increasing on the larger interval $[-\pi/2, \pi/2]$, taking values from -1
to 1. It follows from Theorem 5.35b that sin x has an inverse on the interval
[-1, 1]. This inverse function, which we denote by arc sin x, is continuous and
increasing on [-1, 1], taking values from $-\pi/2$ to $\pi/2$.

By the same token, the function

$$\cos x = \sin \left(x + \frac{\pi}{2}\right) \tag{25}$$

is increasing on the interval $[-\pi, 0]$, taking values from -1 to 1, and hence has an
inverse on the interval [-1, 1], again by Theorem 5.35b. This inverse function,
which we denote by arc cos x, is continuous and increasing on [-1, 1], taking
values from $-\pi$ to 0. Letting u denote both sides of (25), we have

$$x = \text{arc cos } u, \qquad x + \frac{\pi}{2} = \text{arc sin } u,$$

which together imply the formula

$$\text{arc cos } u = \text{arc sin } u - \frac{\pi}{2}, \tag{26}$$

valid for all $u \in [-1, 1]$.

Alternatively, we could have defined another function arc cos u by using the
"decreasing branch" of the function cos x on the interval $[0, \pi]$. This time the
inverse function is continuous and decreasing on [-1, 1], taking values from $\pi$ to
0. Instead of formula (25) relating the functions cos x and sin x on the intervals
$[-\pi, 0]$ and $[-\pi/2, \pi/2]$, respectively, we now have

$$\cos x = \sin \left(\frac{\pi}{2} - x\right). \tag{25'}$$

Correspondingly, this formula leads to

$$\text{arc cos } u = \frac{\pi}{2} - \text{arc sin } u, \tag{26'}$$

instead of (26). To avoid possible confusion, we can designate the first of these
inverse functions of cos x by $\text{arc cos}_i\, u$ (i for "increasing") and the second by
$\text{arc cos}_d\, u$ (d for "decreasing"). The functions arc sin u, $\text{arc cos}_i\, u$, and
$\text{arc cos}_d\, u$ have the graphs shown in Figure 7. There are, of course, other
functions on [-1, 1] which might equally well be regarded as inverses of sin x and
cos x, but we confine ourselves to the three functions just described.


5.68. Finally, let

$$\tan x = \frac{\sin x}{\cos x}.$$

The function tan x is defined everywhere except at the points where cos x = 0,
namely, the points

$$x = \frac{\pi}{2} + k\pi \qquad (k = 0, \pm 1, \pm 2, \ldots).$$

Clearly,

$$\lim_{x \nearrow \pi/2} \tan x = +\infty, \qquad \lim_{x \searrow \pi/2} \tan x = -\infty.$$

Figure 7 (graphs of x = arc sin u, x = $\text{arc cos}_i\, u$, and x = $\text{arc cos}_d\, u$)

Figure 8

Moreover, (19) and (20) show that the function tan x is periodic, with period $\pi$. It
follows from the behavior of sin x and cos x on the interval $(-\pi/2, \pi/2)$ that tan x
is continuous and increasing on $(-\pi/2, \pi/2)$, taking all real values. By Theorem
5.35c, the function tan x has an inverse on the whole real line $(-\infty, \infty)$. This
inverse function, denoted by arc tan x, is continuous and increasing on
$(-\infty, \infty)$, taking values from $-\pi/2$ to $\pi/2$. The graphs of the functions tan x
and arc tan x are shown in Figure 8.

The functions sin x, cos x, and tan x are called trigonometric functions. As an
exercise, the reader should study the other trigonometric functions

$$\csc x = \frac{1}{\sin x}, \qquad \sec x = \frac{1}{\cos x}, \qquad \cot x = \frac{1}{\tan x}$$

and their inverses.


5.7. Applications of Trigonometric Functions 


5.71. Polar coordinates in the plane. Let (x, y) be any point in the real plane
$R_2$ such that $x^2 + y^2 > 0$. Then obviously

$$\frac{x^2}{x^2 + y^2} + \frac{y^2}{x^2 + y^2} = 1.$$

Suppose we define the number $\theta$ in the interval $0 \le \theta < 2\pi$ by the conditions

$$\cos \theta = \frac{x}{\sqrt{x^2 + y^2}}, \qquad \sin \theta = \frac{y}{\sqrt{x^2 + y^2}}, \tag{1}$$

where the square roots are positive, as always. The number $\theta$ exists and is
unique. For x > 0, y > 0 this follows from the continuity and monotonicity of
$\cos \theta$ and $\sin \theta$ on the interval $(0, \pi/2)$, together with the intermediate value
theorem (Theorem 5.23) and formula (1), p. 157. In the other cases, it follows from
formulas (17)-(20), p. 161 and the sign rules of Sec. 5.65c. The number $\theta$ is
called the polar angle of the point (x, y). Note that

$$\theta = \begin{cases} 0 & \text{if } x > 0,\ y = 0, \\ \pi/2 = 90^\circ & \text{if } x = 0,\ y > 0, \\ \pi = 180^\circ & \text{if } x < 0,\ y = 0, \\ 3\pi/2 = 270^\circ & \text{if } x = 0,\ y < 0. \end{cases}$$

Making use of the periodicity of the functions $\sin \theta$ and $\cos \theta$, we might also
regard $\theta$ not only as the unique solution of the system (1) in the interval
$[0, 2\pi)$, but also as any real number differing from this solution by an integral
multiple of $2\pi$.
The number

$$r = \sqrt{x^2 + y^2} \tag{2}$$

is called the radius vector† of the point (x, y). In terms of r and $\theta$, we can write
the formulas (1) as

$$x = r \cos \theta, \qquad y = r \sin \theta. \tag{3}$$

The numbers r and $\theta$ are called the polar coordinates of the point (x, y). They are
defined everywhere except at the point x = 0, y = 0, where r = 0 but $\theta$ becomes
meaningless. The curve r = const is just the circle

$$x^2 + y^2 = r^2,$$

while the curve $\theta$ = const is the ray

$$\frac{y}{x} = \tan \theta.$$
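
The passage from rectangular to polar coordinates can be sketched in a few lines of Python. This is an illustration only: the function name `to_polar` is an assumption of this example, and the library call `math.atan2` (which returns an angle in $(-\pi, \pi]$) is used in place of the case analysis of the text, with a reduction modulo $2\pi$ to land in $[0, 2\pi)$ up to floating-point rounding.

```python
import math


def to_polar(x, y):
    """Return (r, theta) with r > 0 and 0 <= theta < 2*pi, as in (1)-(3)."""
    if x == 0 and y == 0:
        raise ValueError("the polar angle is undefined at the origin")
    r = math.hypot(x, y)                        # r = sqrt(x**2 + y**2)
    theta = math.atan2(y, x) % (2 * math.pi)    # shift (-pi, pi] into [0, 2*pi)
    return r, theta


r, theta = to_polar(-3.0, -4.0)
print(r, math.degrees(theta))                   # a point in the third quadrant
print(r * math.cos(theta), r * math.sin(theta)) # recovers (x, y) via (3)
```

The last line checks formula (3): multiplying r by the cosine and sine of the polar angle should reproduce the original coordinates.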
5.72. Trigonometric form of the complex numbers. Let z = x + iy ≠ 0 be a
complex number, as in Sec. 2.7. Then, using (3), we can represent z in the
trigonometric form

$$z = r(\cos \theta + i \sin \theta), \tag{4}$$

where $\theta$ and r are determined by (1) and (2). In this context, r is called the
absolute value or modulus of z, written r = |z|, while $\theta$ is called the argument† of
z, written $\theta = \arg z$.

Suppose we use formulas (2) and (3), p. 157 to form the product $z_1 z_2$ of two
complex numbers

$$z_1 = r_1(\cos \theta_1 + i \sin \theta_1), \qquad z_2 = r_2(\cos \theta_2 + i \sin \theta_2),$$

written in trigonometric form. The result is just

$$z_1 z_2 = r_1 r_2[(\cos \theta_1 \cos \theta_2 - \sin \theta_1 \sin \theta_2) + i(\sin \theta_1 \cos \theta_2 + \cos \theta_1 \sin \theta_2)]
= r_1 r_2[\cos (\theta_1 + \theta_2) + i \sin (\theta_1 + \theta_2)]. \tag{5}$$

Thus multiplication of two complex numbers leads to multiplication of their
absolute values and addition of their arguments.


5.73. In particular, (5) implies

$$z^m = r^m(\cos m\theta + i \sin m\theta) \tag{6}$$

for every z = x + iy ≠ 0. Using (6), we now solve the equation

$$z^m = w, \tag{7}$$

where w is a fixed complex number. For w = 0, (7) has the obvious (unique)
solution z = 0. Hence we assume w ≠ 0 and write w in the form

$$w = R(\cos \omega + i \sin \omega).$$

Suppose we express z in the form (4), where r and $\theta$ are as yet undetermined. It
follows from (6) that

$$r^m(\cos m\theta + i \sin m\theta) = R(\cos \omega + i \sin \omega),$$

and hence

$$r^m \cos m\theta = R \cos \omega, \qquad r^m \sin m\theta = R \sin \omega. \tag{8}$$

Squaring these equations and adding them, we get the equation

$$r^{2m} = R^2,$$

whose unique positive solution is

$$r = \sqrt[m]{R}.$$

Dividing the equations (8) by $r^m = R$, we now have

$$\cos m\theta = \cos \omega, \qquad \sin m\theta = \sin \omega,$$

which take the form

$$\cos (x + u) = \cos u, \qquad \sin (x + u) = \sin u$$

if we set

$$u = \omega, \qquad x + u = m\theta, \qquad x = m\theta - \omega.$$

Hence, by Lemma 5.66,

$$m\theta - \omega = 2k\pi \qquad (k = 0, \pm 1, \pm 2, \ldots),$$

$$\theta = \frac{\omega}{m} + \frac{2k\pi}{m} \qquad (k = 0, \pm 1, \pm 2, \ldots).$$

Choosing k = 0, 1, ..., m - 1, we get m distinct solutions of equation (7):

$$z_k = \sqrt[m]{R} \left[\cos \left(\frac{\omega}{m} + \frac{2k\pi}{m}\right) + i \sin \left(\frac{\omega}{m} + \frac{2k\pi}{m}\right)\right]. \tag{9}$$

The remaining values of k lead to values of $\theta$ which differ by an integral multiple
of $2\pi$ from one of the values already found, and hence give no new solutions of
(7). The numbers (9), called the complex mth roots of w (cf. Sec. 1.63) lie at the
vertices of a regular m-gon centered at the origin, as illustrated schematically by
Figure 9 for the case m = 6, R = 1, $\omega$ = 192°.


Figure 9 
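
Formula (9) translates directly into a short program. The following Python sketch is illustrative only: the function name `complex_roots` is an assumption of this example, and the argument of w is taken from the library call `cmath.phase`, which returns a value in $(-\pi, \pi]$ rather than in $[0, 2\pi)$. By the remark on integral multiples of $2\pi$, this changes only the order in which the roots are listed.

```python
import cmath
import math


def complex_roots(w, m):
    """All solutions z of z**m == w, following formula (9) for w != 0."""
    if w == 0:
        return [0j]                             # the unique solution when w = 0
    R, omega = abs(w), cmath.phase(w)           # w = R(cos omega + i sin omega)
    r = R ** (1.0 / m)                          # the positive m-th root of R
    return [
        r * complex(math.cos((omega + 2 * k * math.pi) / m),
                    math.sin((omega + 2 * k * math.pi) / m))
        for k in range(m)
    ]


for z in complex_roots(-8, 3):
    print(z, z ** 3)    # each cube should be close to -8
```

The three cube roots of -8 produced here have modulus 2 and arguments that differ by $2\pi/3$, so they sit at the vertices of an equilateral triangle centered at the origin.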


5.74. Angle between two spatial vectors 


a. Definition. Given two vectors

$$x = (x_1, \ldots, x_n), \qquad y = (y_1, \ldots, y_n)$$

in the n-dimensional Euclidean space $R_n$, let

$$(x, y) = \sum_{k=1}^n x_k y_k$$

be their scalar product, as in Sec. 3.14a. Then, recalling the Schwarz inequality

$$|(x, y)| \le |x||y|$$

(formula (7), p. 57), we can find a unique number $\omega$ in the interval
$0 \le \omega \le \pi$ such that

$$\cos \omega = \frac{(x, y)}{|x||y|}, \tag{10}$$

provided of course that x ≠ 0, y ≠ 0. The number $\omega$, denoted by
$\langle x, y \rangle$, is called the angle between the vectors x and y.

b. Suppose $\omega = 0$, so that

$$(x, y) = |x||y|. \tag{11}$$

Then the vectors x and y are linearly dependent (see Sec. 2.64). In fact, if (11)
holds, the equation

$$0 = (x - \lambda y, x - \lambda y) = (x, x) - 2\lambda(x, y) + \lambda^2(y, y)
= |x|^2 - 2\lambda|x||y| + \lambda^2|y|^2 = (|x| - \lambda|y|)^2$$

has the solution

$$\lambda = \frac{|x|}{|y|},$$

and hence $x = \lambda y$ for this value of $\lambda$. Similarly, if $\omega = \pi$, i.e., if
(x, y) = -|x||y|, the vectors x and y are again linearly dependent, this time with
constant of proportionality $-\lambda$.

c. If $\omega = \pi/2$, i.e., if (x, y) = 0, then the vectors x and y are said to be
orthogonal. We continue to use the condition (x, y) = 0 to define orthogonality
even in the case where one or both of the vectors x, y equals 0. Thus the zero
vector is regarded as orthogonal to every vector.

d. Let $x = (x_1, x_2)$, $y = (y_1, y_2)$ be two vectors in the plane $R_2$, with polar
coordinates r, $\varphi$ and $\rho$, $\psi$, respectively, so that

$$x_1 = r \cos \varphi, \quad x_2 = r \sin \varphi, \quad y_1 = \rho \cos \psi, \quad y_2 = \rho \sin \psi.$$

Then, by definition,

$$\cos \omega = \cos \langle x, y \rangle = \frac{(x, y)}{|x||y|} = \frac{r\rho(\cos \varphi \cos \psi + \sin \varphi \sin \psi)}{r\rho} = \cos (\varphi - \psi). \tag{12}$$

Thus the angle $\langle x, y \rangle$ between the vectors x and y is just one of the numbers

$$\varphi - \psi + 2k\pi, \qquad \psi - \varphi + 2k\pi \qquad (k = 0, \pm 1, \pm 2, \ldots),$$

namely the one lying in the interval $[0, \pi]$.
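
Formula (10) can be turned into a small routine for vectors given as lists of numbers. The sketch below is illustrative: the names `scalar_product` and `angle` are assumptions of this example, the library function `math.acos` plays the role of the inverse of the cosine on $[0, \pi]$, and the clamping step guards against rounding errors that might push the quotient slightly outside [-1, 1].

```python
import math


def scalar_product(x, y):
    """(x, y) = sum of x_k * y_k."""
    return sum(xk * yk for xk, yk in zip(x, y))


def angle(x, y):
    """Angle in [0, pi] between nonzero vectors x and y, from formula (10)."""
    nx = math.sqrt(scalar_product(x, x))
    ny = math.sqrt(scalar_product(y, y))
    if nx == 0 or ny == 0:
        raise ValueError("the angle is defined only for nonzero vectors")
    c = scalar_product(x, y) / (nx * ny)
    c = max(-1.0, min(1.0, c))   # guard against rounding slightly outside [-1, 1]
    return math.acos(c)


print(angle((1.0, 0.0), (0.0, 1.0)))            # orthogonal vectors
print(angle((1.0, 2.0, 3.0), (2.0, 4.0, 6.0)))  # linearly dependent, same direction
print(angle((1.0, 0.0), (-1.0, 0.0)))           # linearly dependent, opposite direction
```

The three sample calls correspond to the cases $\omega = \pi/2$, $\omega = 0$, and $\omega = \pi$ discussed in items b and c.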

5.75. Rotations 


a. By a rotation through the angle $\theta$ in the plane $R_2$ we mean the mapping or
"transformation" of $R_2$ into itself which leaves the origin fixed and carries the
vector x ≠ 0 with polar coordinates r, $\varphi$ into the vector x' with polar coordinates
r, $\varphi + \theta$. To describe this transformation in rectangular coordinates, let
$x = (x_1, x_2)$, $x' = (x_1', x_2')$. Then

$$x_1' = r \cos (\varphi + \theta) = r(\cos \varphi \cos \theta - \sin \varphi \sin \theta) = x_1 \cos \theta - x_2 \sin \theta,$$
$$x_2' = r \sin (\varphi + \theta) = r(\sin \varphi \cos \theta + \cos \varphi \sin \theta) = x_1 \sin \theta + x_2 \cos \theta. \tag{13}$$

By its very definition, a rotation does not change lengths of vectors. Moreover,
a rotation does not change angles between vectors. In fact, if $\varphi$ and $\psi$ are the
polar angles of two vectors x and y, then the rotation carries x and y into vectors
x' and y' with polar angles $\varphi + \theta$ and $\psi + \theta$. But the angle $\omega$ between the
vectors x and y is determined by the condition

$$\cos \omega = \cos (\varphi - \psi)$$

(cf. (12)), while the angle $\omega'$ between the vectors x' and y' is determined by the
condition

$$\cos \omega' = \cos [(\varphi + \theta) - (\psi + \theta)] = \cos (\varphi - \psi).$$

Hence $\omega = \omega'$, since $\omega$ and $\omega'$ both lie in the interval $[0, \pi]$.


b.† More generally, we can define a rotation in the n-dimensional Euclidean
space $R_n$ as follows. Let A be an automorphism of $R_n$, carrying the vector
$x = (x_1, \ldots, x_n)$ into the vector $x' = A(x) = (x_1', \ldots, x_n')$. Then, as in
Sec. 2.67, the effect of A on the components of x is described by the system of
"linear relations"

$$x_1' = a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n,$$
$$\cdots \tag{14}$$
$$x_n' = a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n,$$

or more concisely,

$$x_j' = \sum_{k=1}^n a_{jk}x_k \qquad (j = 1, \ldots, n), \tag{14'}$$

in terms of certain real numbers $a_{jk}$ whose meaning is clear from formula (17)
below. Suppose the "transformation" A does not change the lengths of vectors or
the angles between them, and suppose further that the determinant of the matrix
$\|a_{jk}\|$, denoted by $\det \|a_{jk}\|$, equals 1.‡ Then A is called a rotation or (proper)
orthogonal transformation in $R_n$.

THEOREM. The transformation A is a rotation if and only if the coefficients $a_{jk}$
satisfy the condition§

$$\sum_{k=1}^n a_{ki}a_{kj} = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j \end{cases} \tag{15}$$

(besides $\det \|a_{jk}\| = 1$).

Proof. Let $e_k$ be the kth vector of the basis

$$e_1 = (1, 0, \ldots, 0), \quad \ldots, \quad e_n = (0, 0, \ldots, 1)$$

(cf. Sec. 2.65), with 1 as its kth component and 0 for all its other components,
and let $e_k' = A(e_k)$. Then, if A is a rotation, the vectors $e_1', \ldots, e_n'$ must be
orthogonal and normalized (Sec. 3.14a), like the vectors $e_1, \ldots, e_n$ themselves,
i.e., we must have

$$(e_i', e_j') = \delta_{ij}. \tag{16}$$

But, by (14),

$$e_j' = (a_{1j}, \ldots, a_{nj}) \tag{17}$$

(cf. p. 42), so that

$$(e_i', e_j') = \sum_{k=1}^n a_{ki}a_{kj},$$

which, together with (16), gives (15).
Conversely, suppose (15) holds, and let x' = A(x), y' = A(y). Then, by (14),

$$(x', y') = \sum_{k=1}^n x_k' y_k' = \sum_{k=1}^n \left(\sum_{i=1}^n a_{ki}x_i\right) \left(\sum_{j=1}^n a_{kj}y_j\right)
= \sum_{i=1}^n \sum_{j=1}^n \left(\sum_{k=1}^n a_{ki}a_{kj}\right) x_i y_j = \sum_{i=1}^n \sum_{j=1}^n \delta_{ij} x_i y_j
= \sum_{i=1}^n x_i y_i = (x, y),$$

so that A preserves lengths of vectors and angles between them. Hence A is a
rotation.
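
For the plane, condition (15) and the invariance of the scalar product can be checked directly. The Python sketch below is illustrative only: the function names are assumptions of this example, matrices are represented as lists of rows with `a[j][k]` standing for $a_{jk}$, and the check is restricted to 2 x 2 matrices so that the determinant can be written out by hand.

```python
import math


def rotation_matrix(theta):
    """Matrix of formula (13): x1' = x1 cos(theta) - x2 sin(theta), etc."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def apply(a, x):
    """x'_j = sum_k a[j][k] * x[k], as in (14')."""
    return [sum(ajk * xk for ajk, xk in zip(row, x)) for row in a]


def is_rotation(a, tol=1e-12):
    """Check condition (15) and det = 1 for a 2 x 2 matrix."""
    if len(a) != 2 or any(len(row) != 2 for row in a):
        raise ValueError("this sketch handles 2 x 2 matrices only")
    for i in range(2):
        for j in range(2):
            s = sum(a[k][i] * a[k][j] for k in range(2))   # sum_k a_ki * a_kj
            if abs(s - (1.0 if i == j else 0.0)) > tol:
                return False
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return abs(det - 1.0) <= tol


a = rotation_matrix(0.7)
print(is_rotation(a))                          # condition (15) and det = 1 hold
print(is_rotation([[1.0, 0.0], [0.0, -1.0]]))  # a reflection: (15) holds but det = -1
```

The reflection in the second call satisfies (15) and so preserves lengths and angles, but its determinant is -1, which is why the theorem adds the determinant condition.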


5.76. Polar coordinates in space. Let the basis vector $e_k$ be the same as above, and let $\alpha_k$ be the angle between $e_k$ and the vector (or “point”) $x = (x_1, \ldots, x_n) \in R_n$. Then, by (10),

$$\cos \alpha_k = \cos \langle x, e_k \rangle = \frac{(x, e_k)}{|x| |e_k|} = \frac{x_k}{|x|}, \tag{18}$$

so that

$$x_k = |x| \cos \alpha_k \qquad (k = 1, \ldots, n).$$

The “polar angles” $\alpha_k = \langle x, e_k \rangle$, together with the “radius vector” $|x|$, are called the polar coordinates of the vector x. Note that the angles $\alpha_k$ satisfy the condition

$$\sum_{k=1}^{n} \cos^2 \alpha_k = \sum_{k=1}^{n} \frac{x_k^2}{|x|^2} = 1.$$

Suppose two vectors x and y have polar angles $\alpha_1, \ldots, \alpha_n$ and $\beta_1, \ldots, \beta_n$, respectively. Then, clearly,

$$\cos \langle x, y \rangle = \frac{(x, y)}{|x| |y|} = \cos \alpha_1 \cos \beta_1 + \cdots + \cos \alpha_n \cos \beta_n.$$

If $\omega = (\omega_1, \ldots, \omega_n)$ is a unit vector, so that $|\omega| = 1$, then (18) gives

$$\omega_k = \cos \langle \omega, e_k \rangle \qquad (k = 1, \ldots, n),$$

i.e., the numbers $\omega_1, \ldots, \omega_n$ are just the cosines of the angles between $\omega$ and the corresponding basis vectors. In particular, suppose the transformation (14′) is a rotation. Then the coefficients $a_{jk}$, being the components of the unit vectors $\varepsilon_k$, have a simple geometrical interpretation: $a_{jk}$ is the cosine of the angle between the basis vector $\varepsilon_k$ and the basis vector $e_j$.
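As a small illustration of formula (18), the sketch below computes the radius vector and the polar angles of a point in $R_n$ and lets one verify that the squared cosines sum to 1. The function name `polar_coordinates` is a hypothetical choice, and the clamping step is only a guard against floating-point rounding.

```python
import math

def polar_coordinates(x):
    """Return (|x|, [alpha_1, ..., alpha_n]) with cos(alpha_k) = x_k / |x|."""
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        raise ValueError("polar angles are undefined for the zero vector")
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    angles = [math.acos(max(-1.0, min(1.0, c / r))) for c in x]
    return r, angles

r, alphas = polar_coordinates([1.0, 2.0, 2.0])
print(r)
print(sum(math.cos(a) ** 2 for a in alphas))
```

The original components can be recovered as `r * math.cos(alpha_k)`, which is the relation $x_k = |x| \cos \alpha_k$.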


5.8. Continuous Vector Functions of a Vector Variable 


5.81. By a vector function of a vector variable we mean a function $f(x)$ with domain $E \subset R_m$ and range $R_n$ (synonymously, a mapping of $E \subset R_m$ into $R_n$), where $R_m$ and $R_n$ are the familiar Euclidean spaces (of dimensions m and n, respectively) defined in Sec. 3.14a. Since $R_m$ and $R_n$ are metric spaces when equipped with the usual distance, the general definition of Sec. 5.11a takes the following form: The function $f(x)$ is said to be continuous at the point $x = a \in E$ if

$$\lim_{x \to a} f(x) = f(a),$$

i.e., if, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that $|x - a| < \delta$ implies

$$|f(x) - f(a)| < \varepsilon.$$

Here, of course, $|\cdots|$ denotes the length of the vector written between the vertical bars, calculated in $R_m$ or $R_n$ as the case may be.


5.82. THEOREM. Suppose the vector function $f(x)$ is written in the form

$$f(x) = (f_1(x), \ldots, f_n(x)),$$

involving n numerical functions

$$f_j(x) = f_j(x_1, \ldots, x_m) \qquad (j = 1, \ldots, n).$$

Then $f(x)$ is continuous at $x = a \in E$ if and only if each component function $f_1(x), \ldots, f_n(x)$ is continuous at $x = a$.

Proof. An immediate consequence of Theorem 4.75. ∎

5.83. a. THEOREM. If the vector functions $f(x)$ and $g(x)$ are continuous at $x = a \in E$, then so is their sum $f(x) + g(x)$.

Proof. An immediate consequence of Theorem 4.73c. ∎

b. THEOREM. If the vector function $f(x)$ and the numerical function $\alpha(x)$ are continuous at $x = a \in E$, then so is their product $\alpha(x) f(x)$.

Proof. An immediate consequence of Theorem 4.73d. ∎

c. THEOREM. If the vector function $f(x)$ is continuous at $x = a \in E$, then so is the numerical function $|f(x)|$.


Proof. An immediate consequence of Theorem 5.15b and the subsequent remark. 


5.84. Now let m = n = 2, and consider a function $w = f(z)$ with domain $E \subset R_2$ and range $R_2$. In this case we can regard $z = (x, y)$ and $w = (u, v)$ as complex numbers, and $w = f(z)$ can correspondingly be called a complex function of a complex variable. According to Sec. 5.81, we now say that $f(z)$ is continuous at the point $z = a \in E$ if

$$\lim_{z \to a} f(z) = f(a),$$

i.e., if, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that $|z - a| < \delta$ implies

$$|f(z) - f(a)| < \varepsilon.$$

Moreover, according to Theorem 5.82, the function $f(z) = u(z) + iv(z)$ is continuous at $z = a$ if and only if both real functions $u(z)$ and $v(z)$ are continuous at $z = a$. Invoking the results of Sec. 5.83, we see that if the complex function $f(z)$ is continuous at $z = a$, then so is its absolute value $|f(z)|$, as well as the product $\alpha(z) f(z)$, where $\alpha(z)$ is any real function continuous at $z = a$, while if two complex functions $f(z)$ and $g(z)$ are continuous at $z = a$, then so is their sum $f(z) + g(z)$.

Since products and quotients of complex functions are readily defined (see Sec. 4.72c), we have the following additional

THEOREM. If two complex functions $f(z)$ and $g(z)$ are continuous at $z = a$, then so are their product $f(z) g(z)$ and quotient

$$\frac{f(z)}{g(z)} \qquad (g(z) \neq 0).$$

Proof. An immediate consequence of Theorem 4.73e. ∎

In particular, since $f(z) = z$ is obviously continuous for all $z = x + iy$, it follows that every polynomial

$$P(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 \neq 0,\ a_1, \ldots, a_n \in C) \tag{1}$$

is continuous on the whole complex field C, while every rational function

$$\frac{a_0 z^n + a_1 z^{n-1} + \cdots + a_n}{b_0 z^m + b_1 z^{m-1} + \cdots + b_m} \qquad (a_0 \neq 0,\ b_0 \neq 0;\ a_0, \ldots, a_n, b_0, \ldots, b_m \in C)$$

is continuous at every point $z \in C$ where its denominator is nonvanishing.


5.85. The fundamental theorem of algebra. By a root (or zero) of the polynomial (1) we mean any number $z_0$ (real or complex) such that $P(z_0) = 0$. There are polynomials (even polynomials with real coefficients) with no real roots at all, e.g., $P(z) = z^2 + 1$. However, as we now show, every nonconstant polynomial has at least one complex root, a result known as the fundamental theorem of algebra.

a. LEMMA. Let $P(z)$ be the polynomial (1). Then, given any $A > 0$, there exists an $R > 0$ such that $|P(z)| \ge A$ for all $|z| \ge R$.

Proof. Clearly

$$P(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n = a_0 z^n \left[ 1 + \frac{a_1}{a_0 z} + \cdots + \frac{a_n}{a_0 z^n} \right]$$

if $z \neq 0$. But the expression in brackets approaches 1 as $z \to \infty$ (see Sec. 4.73f). Hence there is a number $R_0$ such that

$$\left| 1 + \frac{a_1}{a_0 z} + \cdots + \frac{a_n}{a_0 z^n} \right| \ge \frac{1}{2}$$

for all $|z| \ge R_0$. It follows that

$$|P(z)| = |a_0| |z|^n \left| 1 + \frac{a_1}{a_0 z} + \cdots + \frac{a_n}{a_0 z^n} \right| \ge \frac{1}{2} |a_0| |z|^n$$

for all $|z| \ge R_0$. Now let

$$R > \max \left\{ R_0, \left( \frac{2A}{|a_0|} \right)^{1/n} \right\}.$$

Then

$$|P(z)| \ge \frac{1}{2} |a_0| |z|^n \ge \frac{1}{2} |a_0| R^n > A$$

for all $|z| \ge R$. ∎


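b. LEMMA. Let $P(z)$ be a polynomial (1) of degree $n \ge 1$, and suppose $P(z_0) \neq 0$. Then, given any $\delta_0 > 0$, there exists a point $z_1$ such that $|z_1 - z_0| \le \delta_0$ and $|P(z_1)| < |P(z_0)|$.

Proof. First consider the case $z_0 = 0$, $P(0) = 1$. Let $a_k$ be the first of the coefficients $a_{n-1}, \ldots, a_0$ which is nonvanishing. Then

$$P(z) = 1 + a_k z^{n-k} + \cdots + a_0 z^n = 1 + a_k z^{n-k} [1 + H(z)],$$

where

$$H(z) = \frac{a_{k-1}}{a_k} z + \cdots + \frac{a_0}{a_k} z^k.$$

Since obviously $H(0) = 0$, it follows from the continuity of the polynomial $H(z)$ that

$$|H(z)| < \frac{1}{2}$$

for every point z in some disk $|z| \le \delta \le \delta_0$, where $\delta$ is small enough to make $\delta^{n-k} |a_k| < 1$.

Let $z_1$ be a solution of the equation

$$z_1^{n-k} = -\frac{|a_k|}{a_k} \delta^{n-k},$$

so that obviously $|z_1| = \delta$ (the existence of $z_1$ follows from Sec. 5.73). Then

$$|P(z_1)| = |1 + a_k z_1^{n-k} + a_k z_1^{n-k} H(z_1)| = \big| 1 - \delta^{n-k} |a_k| - \delta^{n-k} |a_k| H(z_1) \big| \le 1 - \delta^{n-k} |a_k| + \delta^{n-k} |a_k| |H(z_1)| < 1 - \delta^{n-k} |a_k| + \tfrac{1}{2} \delta^{n-k} |a_k| = 1 - \tfrac{1}{2} \delta^{n-k} |a_k| < 1,$$

as required.

In the general case, where $P(z_0) \neq 0$, we use the formula

$$P(z) = \sum_{k=0}^{n} a_k z^{n-k} = \sum_{k=0}^{n} a_k [(z - z_0) + z_0]^{n-k}$$

to expand $P(z)$ in powers of $z - z_0$, obtaining

$$P(z) = \sum_{k=0}^{n} b_k (z - z_0)^{n-k}, \qquad P(z_0) = b_n \neq 0.$$

Replacing $z - z_0$ by a new variable $\zeta$, we then get

$$P(z) = b_n P_0(\zeta),$$

where

$$P_0(\zeta) = \frac{b_0}{b_n} \zeta^n + \cdots + \frac{b_{n-1}}{b_n} \zeta + 1.$$

Since

$$P_0(0) = 1, \qquad |P(z)| = |b_n| |P_0(\zeta)|,$$

this case reduces to the one already considered. ∎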
c. We are now in a position to prove our main result:

THEOREM (Fundamental theorem of algebra). Every polynomial (1) of degree $n \ge 1$ has at least one complex root.

Proof. Let

$$\alpha = \inf_{z \in C} |P(z)| \ge 0.$$

By the first lemma, there is an R such that

$$|P(z)| \ge \alpha + 1$$

for all $|z| \ge R$. Hence the values of $P(z)$ outside the disk $|z| \le R$ play no role in the calculation of $\alpha$, and we can write

$$\alpha = \inf_{|z| \le R} |P(z)|.$$

But the disk $K = \{z : |z| \le R\}$ is compact (by Theorem 3.96e), unlike the whole plane C. Hence, by Weierstrass’ theorem (Sec. 5.16b), the continuous function $|P(z)|$ achieves its greatest lower bound at some point $z_0 \in K$:

$$|P(z_0)| = \alpha.$$

Suppose $P(z_0) \neq 0$. The point $z_0$ is an interior point of K, since $|P(z)| \ge \alpha + 1$ on the boundary of K, i.e., for $|z| = R$. Therefore some small disk $|z - z_0| \le \delta_0$ lies entirely inside K. By the second lemma, there is a point $z_1$ in this disk such that $|P(z_1)| < |P(z_0)|$. But then $\alpha = |P(z_0)|$ cannot be the greatest lower bound of the function $|P(z)|$ on the disk K. This contradiction shows that $P(z_0) = 0$ (and hence $\alpha = 0$).
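The second lemma can be illustrated numerically. The sketch below does not reproduce the explicit construction used in the proof; instead it samples points on small circles around $z_0$ (halving the radius when necessary) until it finds a point where $|P|$ is smaller, and then repeats the step to approach a root. The names `poly` and `smaller_point`, the sample count, and the starting data are all assumptions made for the illustration.

```python
import cmath
import math

def poly(coeffs, z):
    """Evaluate a_0 z^n + ... + a_n by Horner's rule (coeffs = [a_0, ..., a_n])."""
    value = 0j
    for a in coeffs:
        value = value * z + a
    return value

def smaller_point(coeffs, z0, delta0, samples=360, max_halvings=60):
    """Search for z1 with |z1 - z0| <= delta0 and |P(z1)| < |P(z0)|.

    Points are sampled on circles of radius delta0, delta0/2, ... around z0.
    Returns None if no such point is found (e.g. at a root, up to rounding).
    """
    target = abs(poly(coeffs, z0))
    delta = delta0
    for _ in range(max_halvings):
        best = min(
            (z0 + delta * cmath.exp(2j * math.pi * t / samples) for t in range(samples)),
            key=lambda z: abs(poly(coeffs, z)),
        )
        if abs(poly(coeffs, best)) < target:
            return best
        delta /= 2
    return None

coeffs = [1, 0, 1]      # P(z) = z^2 + 1, which has no real roots
z = 0.5 + 0j
for _ in range(200):    # repeat the descent step of the lemma
    nxt = smaller_point(coeffs, z, 0.1)
    if nxt is None:
        break
    z = nxt
print(z, abs(poly(coeffs, z)))
```

Starting from the real point 0.5, the iteration leaves the real axis and ends near one of the roots $\pm i$, in line with the theorem's claim that the minimum of $|P(z)|$ must be zero.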


5.86. Factorization of polynomials. Given any two polynomials $P(z)$ and $Q(z) \neq 0$, there are two other polynomials $S(z)$ and $R(z)$ such that

$$P(z) = Q(z) S(z) + R(z), \tag{2}$$

where either $R(z) = 0$ or the degree of $R(z)$ is less than that of $Q(z)$.† In this context, we call $P(z)$ the dividend, $Q(z)$ the divisor, $S(z)$ the quotient, and $R(z)$ the remainder. The sum of the degrees of the divisor and the quotient equals the degree of the dividend. Here z is simply an “unknown,” not as yet assigned any numerical value, and the various polynomials are formal expressions involving z and certain numerical coefficients (subject to the usual rules for addition and multiplication). Substituting some complex number for z changes the identity (2) into an equality between two numbers.

Suppose, in particular, that $Q(z) = z - z_1$. Then $R(z)$ must be of degree zero, i.e., $R(z)$ must reduce to a constant. If $z_1$ is a root of the polynomial $P(z)$, this constant equals zero, since

$$R(z) = R(z_1) = P(z_1) - S(z_1)(z_1 - z_1) = 0.$$

The representation (2) then takes the form

$$P(z) = S(z)(z - z_1),$$

where $S(z)$ is a polynomial of degree n − 1. Applying the same process in turn to the polynomial $S(z)$, we get

$$S(z) = S_1(z) = S_2(z)(z - z_2),$$

where $z_2$ is a root of the polynomial $S_1(z)$ and hence of the original polynomial $P(z)$, while $S_2(z)$ is a polynomial of degree n − 2. Repeating this process, we can eventually lower the degree of the successive quotients to zero, thereby obtaining the factorization

$$P(z) = S_0 (z - z_1) \cdots (z - z_n), \tag{3}$$

where $S_0$ is a constant and the numbers $z_1, \ldots, z_n$ are the roots of the polynomial $P(z)$. These roots may not be distinct. Hence, combining expressions involving the same root, we can write the factorization (3) in the final form

$$P(z) = S_0 (z - z_1)^{r_1} \cdots (z - z_k)^{r_k}, \tag{4}$$

where the numbers $z_1, \ldots, z_k$ are now distinct. The exponents $r_1, \ldots, r_k$ are called the multiplicities (or orders) of the corresponding roots. Comparing the two sides of (4), we see that $S_0$ is just the coefficient of $z^n$ in the polynomial $P(z)$.
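Division by a linear factor $z - z_1$ can be carried out by the familiar scheme of synthetic division (Horner's rule). The sketch below is one possible way to organize it; the function name `divide_by_linear` and the sample polynomial are illustrative assumptions.

```python
def divide_by_linear(coeffs, z1):
    """Divide P(z) = a_0 z^n + ... + a_n by (z - z1).

    coeffs is [a_0, ..., a_n]. Returns (quotient coefficients, remainder),
    where the remainder equals P(z1).
    """
    running = []
    carry = 0
    for a in coeffs:
        carry = carry * z1 + a
        running.append(carry)
    remainder = running.pop()   # the last value is P(z1)
    return running, remainder

p = [1, -6, 11, -6]                 # z^3 - 6z^2 + 11z - 6 = (z - 1)(z - 2)(z - 3)
s1, r1 = divide_by_linear(p, 1)     # quotient z^2 - 5z + 6, remainder 0
s2, r2 = divide_by_linear(s1, 2)    # quotient z - 3, remainder 0
s3, r3 = divide_by_linear(s2, 3)    # quotient 1 (the constant S_0), remainder 0
print(s1, r1, s2, r2, s3, r3)
```

Each division by a root lowers the degree by one and leaves remainder zero, and the final constant quotient is the leading coefficient of the original polynomial.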


5.87. Partial fraction expansions of rational functions. Given any two polynomials $P_1(z)$ and $P_2(z)$, we can construct their greatest common divisor $D(z)$, namely, the polynomial of highest degree which divides both $P_1(z)$ and $P_2(z)$. It can be shown that $D(z)$ has the representation

$$D(z) = P_1(z) S_1(z) + P_2(z) S_2(z),$$

where $S_1(z)$ and $S_2(z)$ are themselves polynomials.† Obviously, every root of $D(z)$ is a root of both $P_1(z)$ and $P_2(z)$. Therefore, if $P_1(z)$ and $P_2(z)$ have no common roots, their greatest common divisor is just a constant, which can be taken to equal 1.

Now consider the rational function

$$\frac{Q(z)}{P_1(z) P_2(z)},$$

where $P_1(z)$ and $P_2(z)$ have no common roots and $Q(z)$ is another polynomial. Then, as just noted, there are polynomials $S_1(z)$ and $S_2(z)$ such that

$$P_1(z) S_1(z) + P_2(z) S_2(z) = 1$$

and hence

$$Q(z) = Q(z) P_1(z) S_1(z) + Q(z) P_2(z) S_2(z). \tag{5}$$

Dividing both sides of (5) by $P_1(z) P_2(z)$, we get

$$\frac{Q(z)}{P_1(z) P_2(z)} = \frac{Q(z) S_1(z)}{P_2(z)} + \frac{Q(z) S_2(z)}{P_1(z)}.$$

Clearly, this process can be continued if either $P_1(z)$ or $P_2(z)$ is itself a product of polynomials with no common roots. For example, suppose the polynomial $P(z)$ has the factorization (4). Then the rational function $Q(z)/P(z)$ can be written as a sum

$$\frac{Q(z)}{P(z)} = \frac{Q_1(z)}{(z - z_1)^{r_1}} + \cdots + \frac{Q_k(z)}{(z - z_k)^{r_k}}. \tag{6}$$

Expanding each numerator $Q_j(z)$ in powers of $z - z_j$, we can reduce (6) further to the form

$$\frac{Q(z)}{P(z)} = T(z) + \sum_{j=1}^{k} \sum_{m=1}^{r_j} \frac{A_{jm}}{(z - z_j)^m}, \tag{7}$$

where $T(z)$ is a polynomial and the $A_{jm}$ are numbers. Formula (7) is called the partial fraction expansion of the rational function $Q(z)/P(z)$. The expansion (7) is unique to within order of terms, a fact which justifies the use of the “method of undetermined coefficients” to construct (7).
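For the special case in which all roots of $P(z)$ are distinct (every multiplicity equals 1) and the degree of $Q(z)$ is less than that of $P(z)$, so that $T(z) = 0$, the coefficients in (7) are given by the standard formula $A_j = Q(z_j)/P'(z_j)$. This formula is not derived in the text; the sketch below simply uses it under that assumption, and the function names are hypothetical.

```python
def simple_root_coefficients(q, roots, s0=1.0):
    """Coefficients A_j of Q(z)/P(z) = sum_j A_j / (z - z_j).

    Assumes P(z) = s0 * prod_j (z - z_j) with pairwise distinct roots and
    deg Q < deg P. Uses A_j = Q(z_j) / P'(z_j), where
    P'(z_j) = s0 * prod_{i != j} (z_j - z_i).
    """
    coeffs = []
    for j, zj in enumerate(roots):
        derivative = s0
        for i, zi in enumerate(roots):
            if i != j:
                derivative *= (zj - zi)
        coeffs.append(q(zj) / derivative)
    return coeffs

def expansion_value(coeffs, roots, z):
    """Evaluate sum_j A_j / (z - z_j)."""
    return sum(a / (z - zj) for a, zj in zip(coeffs, roots))

def q(z):
    return z + 1.0                  # numerator Q(z) = z + 1

roots = [1.0, 2.0, 3.0]             # P(z) = (z - 1)(z - 2)(z - 3)
a = simple_root_coefficients(q, roots)
print(a)
```

Comparing `expansion_value(a, roots, z)` with `q(z) / ((z - 1) * (z - 2) * (z - 3))` at a few test points is a quick way to confirm the expansion, in the spirit of the method of undetermined coefficients.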


5.88. Polynomials with real coefficients. Let

$$P(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$$

be a polynomial with real coefficients, and suppose $\bar{z}_0$ is the complex conjugate of the number $z_0$. Then, by Sec. 2.73,

$$\overline{P(z_0)} = \overline{a_0 z_0^n + a_1 z_0^{n-1} + \cdots + a_n} = a_0 \bar{z}_0^n + a_1 \bar{z}_0^{n-1} + \cdots + a_n = P(\bar{z}_0). \tag{8}$$

In particular, let $z_0$ be a root of $P(z)$, so that $P(z_0) = 0$. Then (8) shows that $P(\bar{z}_0) = 0$, i.e., $\bar{z}_0$ is also a root of $P(z)$.
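Formula (8) is easy to check numerically with Python's built-in complex numbers. The following is only a small sketch; the helper `poly` and the sample polynomial $z^2 - 2z + 5$ (with roots $1 \pm 2i$) are choices made for the illustration.

```python
def poly(coeffs, z):
    """Evaluate a_0 z^n + ... + a_n by Horner's rule (coeffs = [a_0, ..., a_n])."""
    value = 0
    for a in coeffs:
        value = value * z + a
    return value

coeffs = [1.0, -2.0, 5.0]     # real coefficients: z^2 - 2z + 5
z0 = 1 + 2j                   # a nonreal root
w = 0.3 - 1.7j                # an arbitrary point, not a root

print(abs(poly(coeffs, z0)), abs(poly(coeffs, z0.conjugate())))
print(abs(poly(coeffs, w.conjugate()) - poly(coeffs, w).conjugate()))
```

Both printed moduli on the first line vanish, showing that the conjugate of a root is again a root, while the second line shows that conjugation commutes with evaluating a real polynomial at an arbitrary point.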


Now let

$$z_0 = x_0 + iy_0, \qquad \bar{z}_0 = x_0 - iy_0$$

be a pair of complex conjugate roots of $P(z)$. Then

$$(z - z_0)(z - \bar{z}_0) = (z - x_0 - iy_0)(z - x_0 + iy_0) = (z - x_0)^2 + y_0^2,$$

and dividing $P(z)$ by the quadratic polynomial $(z - x_0)^2 + y_0^2$ with real coefficients, we get

$$P(z) = [(z - x_0)^2 + y_0^2] P_1(z),$$

where $P_1(z)$ is again a polynomial with real coefficients. Thus, combining all pairs of complex conjugate roots in the factorization (4), we get the following representation of $P(z)$ as a product of factors with real coefficients:

$$P(z) = S_0 [(z - x_1)^2 + y_1^2]^{r_1} \cdots [(z - x_p)^2 + y_p^2]^{r_p} (z - x_{p+1})^{r_{p+1}} \cdots (z - x_q)^{r_q}.$$

Here $x_1 \pm iy_1, \ldots, x_p \pm iy_p$ are the nonreal roots and $x_{p+1}, \ldots, x_q$ the real roots of $P(z)$, with corresponding multiplicities $r_1, \ldots, r_q$. Note that the complex conjugate roots $x_j + iy_j$ and $x_j - iy_j$ have the same multiplicity $r_j$.

Similarly, suppose both polynomials $Q(z)$ and $P(z)$ have real coefficients. Then we can write the partial fraction expansion (7) in the form

$$\frac{Q(z)}{P(z)} = T(z) + \sum_{j=1}^{p} \sum_{m=1}^{r_j} \frac{B_{jm} z + C_{jm}}{[(z - x_j)^2 + y_j^2]^m} + \sum_{j=p+1}^{q} \sum_{m=1}^{r_j} \frac{A_{jm}}{(z - x_j)^m}$$

(give the details).


5.9. Sequences of Functions 


5.91. Let

$$f_1(x), f_2(x), \ldots, f_n(x), \ldots \tag{1}$$

be a sequence of functions, all defined on the same set E and taking values in a metric space M equipped with a distance $\rho$. Then the sequence (1) is said to be convergent on E if every sequence of points

$$f_1(x_0), f_2(x_0), \ldots, f_n(x_0), \ldots \qquad (x_0 \in E)$$

has a limit in M. This limit, which clearly depends on the point $x_0$, will be denoted by $f(x_0)$. The function $f(x)$, equal to $f(x_0)$ at every point $x_0 \in E$, is called the limit of the sequence of functions (1). We then say that (1) converges to $f(x)$, a fact expressed by writing

$$f(x) = \lim_{n \to \infty} f_n(x). \tag{2}$$

More exactly, (2) means that given any $\varepsilon > 0$ and any $x_0 \in E$, there exists an integer $N > 0$ such that

$$\rho(f(x_0), f_n(x_0)) < \varepsilon$$

for all $n > N$.
5.92. Examples

a. Given a numerical function $g(x)$, the sequence of functions

$$f_n(x) = \frac{1}{n} g(x)$$

converges to zero on the domain of $g(x)$.

b. The sequence of functions

$$f_n(x) = x^n$$

is convergent on the interval $-1 < x \le 1$. Its limit is the function

$$f(x) = \begin{cases} 0 & \text{if } -1 < x < 1, \\ 1 & \text{if } x = 1. \end{cases}$$

c. The sequence of functions

$$f_n(x) = \begin{cases} n \sin nx & \text{if } 0 \le x \le \pi/n, \\ 0 & \text{if } \pi/n \le x \le \pi \end{cases}$$

converges to zero at every point $x \in [0, \pi]$. Note however that the maximum deviation of the function $f_n(x)$ from the limit function $f(x) = 0$ equals n, and hence does not approach zero (but rather becomes arbitrarily large) as $n \to \infty$.


5.93. a. Generally speaking, the limit function of a sequence of functions $f_n(x)$ does not have the properties of the functions $f_n(x)$ themselves. Thus, in Example 5.92b, the functions $f_n(x)$ are continuous on (−1, 1], but the limit function is discontinuous on (−1, 1] (namely, at the point x = 1). To say something definite about the limit function, it is usually necessary to impose some conditions on the character of the convergence. The following definition is one of the most important conditions of this kind:

b. Definition. A sequence of functions $f_n(x)$ is said to converge uniformly on E to the limit function $f(x)$ if, given any $\varepsilon > 0$, there exists an integer $N > 0$ such that

$$\rho(f(x), f_n(x)) < \varepsilon \tag{3}$$

for all $n > N = N(\varepsilon)$ and all $x \in E$. The difference between this definition of uniform convergence and the definition of ordinary convergence given in Sec. 5.91 consists in the fact that here one and the same number N works for all $x \in E$ simultaneously, while in ordinary convergence the number N still depends on the choice of the point x. Hence, instead of requiring that (3) hold for all $n > N$ and all $x \in E$, we could just as well have required that

$$\sup_{x \in E} \rho(f(x), f_n(x)) < \varepsilon \tag{4}$$

hold for all $n > N = N(\varepsilon)$. If the metric space M is the m-dimensional Euclidean space $R_m$, so that the functions $f_n(x)$ are vector-valued, the inequalities (3) and (4) become

$$|f(x) - f_n(x)| < \varepsilon \tag{3′}$$

and

$$\sup_{x \in E} |f(x) - f_n(x)| < \varepsilon, \tag{4′}$$

where $|\cdots|$ denotes the length of the vector (function) written between the vertical bars. Note that the convergence is uniform in Example 5.92a (provided the function $g(x)$ is bounded), but not in Examples 5.92b and 5.92c.
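The distinction between ordinary and uniform convergence can be made concrete by estimating the maximum deviation appearing in (4′) on a finite set of sample points. The sketch below does this for the three examples of Sec. 5.92; the choice $g(x) = \sin x$ for Example 5.92a, the sample points, and all function names are assumptions made for the illustration, and a maximum over finitely many points only bounds the true supremum from below.

```python
import math

def max_deviation(f_n, f, xs):
    """Largest value of |f_n(x) - f(x)| over the sample points xs."""
    return max(abs(f_n(x) - f(x)) for x in xs)

def zero(x):
    return 0.0

def example_a(n):
    # f_n(x) = g(x) / n with the assumed bounded choice g(x) = sin x
    return lambda x: math.sin(x) / n

def example_b(n):
    return lambda x: x ** n

def limit_b(x):
    return 1.0 if x == 1.0 else 0.0

def example_c(n):
    return lambda x: n * math.sin(n * x) if 0.0 <= x <= math.pi / n else 0.0

def grid_c(n):
    # Sample points k * pi / (2n) in [0, pi]; k = 1 gives the peak of f_n.
    return [math.pi * k / (2 * n) for k in range(2 * n + 1)]

xs_a = [0.01 * k for k in range(-1000, 1001)]
xs_b = [1.0 - 10.0 ** (-k) for k in range(1, 13)]   # points approaching 1

for n in (10, 100):
    print(n,
          max_deviation(example_a(n), zero, xs_a),
          max_deviation(example_b(n), limit_b, xs_b),
          max_deviation(example_c(n), zero, grid_c(n)))
```

In this experiment the deviation for the first example shrinks like 1/n, for the second it stays close to 1, and for the third it equals n, which is consistent with the remark that only the first sequence converges uniformly.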

The utility of the notion of uniform convergence will be apparent in the 
theorems that follow: 


5.94. THEOREM. If the sequence $f_n(x)$ converges uniformly on E and if every function $f_n(x)$ is bounded on E, i.e., if

$$\rho(a, f_n(x)) \le C_n \qquad (n = 1, 2, \ldots)$$

for some fixed point $a \in M$ and all $x \in E$, then the limit function $f(x)$ is also bounded on E.

Proof. Given any $\varepsilon > 0$, say $\varepsilon = 1$, choose N such that

$$\rho(f(x), f_n(x)) < \varepsilon = 1$$

for all $n > N$ and all $x \in E$. Then

$$\rho(a, f(x)) \le \rho(a, f_{N+1}(x)) + \rho(f_{N+1}(x), f(x)) \le C_{N+1} + 1,$$

so that $f(x)$ is bounded on E. ∎


5.95. a. THEOREM. If the sequence $f_n(x)$ converges uniformly on E, where E is a metric space with distance $\rho_0$, and if every function $f_n(x)$ is continuous at a point $x = a \in E$, then the limit function $f(x)$ is also continuous at $x = a$.

Proof. Given any $\varepsilon > 0$, choose N such that

$$\rho(f_n(x), f(x)) < \varepsilon/3 \tag{5}$$

for all $n > N$ and all $x \in E$. Since the function $f_{N+1}(x)$ is continuous at $x = a$, there exists a $\delta > 0$ such that

$$\rho(f_{N+1}(x), f_{N+1}(a)) < \varepsilon/3 \tag{6}$$

for all x in the ball $U = \{x : \rho_0(x, a) < \delta\}$. Writing the inequality (5) for the points a and $x \in U$ with $n = N + 1$, we get

$$\rho(f_{N+1}(a), f(a)) < \varepsilon/3, \tag{7}$$

$$\rho(f_{N+1}(x), f(x)) < \varepsilon/3. \tag{8}$$

It follows from (6)–(8) that

$$\rho(f(x), f(a)) \le \rho(f(x), f_{N+1}(x)) + \rho(f_{N+1}(x), f_{N+1}(a)) + \rho(f_{N+1}(a), f(a)) < \varepsilon$$

for all $x \in U$, i.e., $f(x)$ is continuous at $x = a$. ∎

b. COROLLARY. The limit of a uniformly convergent sequence of continuous functions on a metric space E is itself a continuous function on E.

Proof. An immediate consequence of the preceding theorem and the fact that continuity on E means continuity at every point of E.


5.96. Finally we prove a test for uniform convergence similar to the Cauchy convergence criterion of Theorems 3.72b and 4.19:

THEOREM. A sequence of functions $f_n(x)$ defined on a set E and taking values in a complete metric space M (with distance $\rho$) is uniformly convergent on E if and only if the following condition is satisfied: Given any $\varepsilon > 0$, there exists an integer $N > 0$ such that

$$\sup_{x \in E} \rho(f_m(x), f_n(x)) < \varepsilon \tag{9}$$

for all $m, n > N$.

Proof. Suppose $f_n(x)$ converges uniformly on E to $f(x)$. Then, given any $\varepsilon > 0$, there is an N such that

$$\rho(f(x), f_n(x)) < \varepsilon/2$$

for all $n > N$ and all $x \in E$. But then

$$\rho(f_m(x), f_n(x)) \le \rho(f_m(x), f(x)) + \rho(f(x), f_n(x)) < \varepsilon$$

for all $m, n > N$ and all $x \in E$, i.e., (9) holds for all $m, n > N$.

Conversely, suppose (9) holds for all $m, n > N$. Then $f_n(x_0)$ is a fundamental sequence for every fixed $x_0 \in E$, and hence, by the completeness of M, $f_n(x_0)$ converges to a limit $f(x_0)$ in M. Let $f(x)$ be the function equal to $f(x_0)$ for all $x_0 \in E$, i.e., let $f(x)$ be the limit function of the sequence $f_n(x)$. Then, taking the limit of (9) as $m \to \infty$, and using the continuity of the distance $\rho$ (Sec. 5.12b), we get

$$\sup_{x \in E} \rho(f(x), f_n(x)) \le \varepsilon$$

for all $n > N$, so that $f_n(x)$ converges uniformly on E to $f(x)$. ∎


Problems 


1. Given a numerical function $f(x)$ defined in a neighborhood of a point $x_0$, suppose that for every $\delta > 0$ there exists an $\varepsilon > 0$ such that $|x - x_0| < \delta$ implies $|f(x) - f(x_0)| < \varepsilon$. Is $f(x)$ continuous at $x = x_0$?

2. With the same $f(x)$ as in the preceding problem, suppose that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ implies $|x - x_0| < \delta$. Is $f(x)$ continuous at $x = x_0$?

3. Prove the continuity of the function $y = |x|$.

4. Investigate the continuity of the functions $y = [x]$ and $y = (x)$, where $[x]$ is the integral part and $(x)$ the fractional part of x, as in Sec. 1.71.

5. Let

$$f(x) = \begin{cases} 1/n & \text{if } x = m/n \text{ is a fraction in lowest terms}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases}$$

Prove that $f(x)$ is continuous at irrational points and discontinuous at rational points.

6. Given two continuous numerical functions $f(x)$ and $g(x)$, prove that the functions

$$\max \{f(x), g(x)\}, \qquad \min \{f(x), g(x)\}$$

are also continuous.

7. Let $f(x)$ be a continuous numerical function on [a, b], and let $x_1, \ldots, x_n$ be arbitrary points in [a, b]. Prove that

$$f(x_0) = \frac{1}{n} [f(x_1) + \cdots + f(x_n)]$$

for some point $x_0 \in [a, b]$.

8. Prove that if a bounded monotonic function $f(x)$ is continuous on a finite or infinite interval (a, b), then $f(x)$ is uniformly continuous on (a, b).

9. Let $f(x)$ be a monotonic function on $(-\infty, \infty)$ satisfying the functional equation

$$f(x + y) = f(x) + f(y).$$

Prove that $f(x)$ is of the form

$$f(x) = ax.$$


10. Find the parts of the extended real line on which the sequence of functions 


| 
S.(*) = 14x 


is uniformly convergent. 

11. Prove that formula (4), p. 158 is not a consequence of formulas (1)–(3), while formula (1) is not a consequence of formulas (2)–(4).

12. Let $f(x)$ be a function defined on $(-\infty, \infty)$ such that given any two points $x_1$, $x_2$ with $x_1 < x_2$ and any c between $f(x_1)$ and $f(x_2)$, there is a point $x_3 \in (x_1, x_2)$ at which $f(x_3) = c$. Is $f(x)$ continuous?

13. Verify the following table:

    x          sin x           cos x           tan x
    $\pi/6$    $1/2$           $\sqrt{3}/2$    $\sqrt{3}/3$
    $\pi/4$    $\sqrt{2}/2$    $\sqrt{2}/2$    $1$
    $\pi/3$    $\sqrt{3}/2$    $1/2$           $\sqrt{3}$

14. Let $f(x)$ be a numerical function defined on a closed interval [a, b]. Prove that there are at most countably many points $c \in [a, b]$ at which $\lim_{x \to c} f(x)$ exists and is different from $f(c)$.

15. Let $t = 0.t_1 t_2 \ldots t_n \ldots$ be the decimal representation (Sec. 1.77) of the real number t ($0 < t < 1$). Let $n_1, n_2, \ldots, n_k, \ldots$ be an increasing subsequence of the sequence of natural numbers 1, 2, ..., n, ... Investigate the continuity of the function

$$x(t) = 0.t_{n_1} t_{n_2} \ldots t_{n_k} \ldots$$

16. Let $\Gamma$ be the circle of unit radius centered at the origin of the plane $R_2$, and let M be the set of all points of $\Gamma$ with polar angles 1, 2, ..., n, ... Prove that M is dense in $\Gamma$.

17. Suppose $f(x)$ is continuous on a set P which is dense in a compact metric space M. Prove that $f(x)$ has a continuous extension onto M (i.e., that there exists a function $g(x)$ which is continuous on M and coincides with $f(x)$ on P) if and only if $f(x)$ is uniformly continuous on P.

18. Suppose $f(x)$ is nondecreasing on the interval (a, b). Prove that $f(x)$ can have no more than countably many discontinuity points.

19. Suppose we drop the requirement that $\det \|a_{jk}\| = 1$ in the definition of a rotation in the space $R_n$ (see Sec. 5.75b). Prove that the resulting transformation A is either a rotation or a rotation combined with a reflection, more exactly the result of consecutive application of a rotation and the reflection

$$y_1 = x_1, \quad y_2 = x_2, \quad \ldots, \quad y_{n-1} = x_{n-1}, \quad y_n = -x_n.$$


† The symbol $\equiv$ means “is identically equal to.”

† In the language of Sec. 2.81, $y = f(x_1, x_2)$ is a rule associating an element $y \in P$ with every pair of points $x_1 \in M_1$, $x_2 \in M_2$.

+ That is, an interior point of [a, b]. 

. To treat the case f(a) > 0, f(b) < 0, either consider the function — fix) or else let c = sup {x E fa, bl: Kx) 
0}. 

+ Ugl and Ugr can be regarded as “one-sided neighborhoods” of the point a. 


† More exactly, $f(x)$ is nondecreasing in the direction $E\{x \nearrow x_0\}$, where $E\{x \nearrow x_0\}$ denotes the intersection of the set E and the sets of the direction $x \nearrow x_0$. However, since all sufficiently small intervals of the direction $x \nearrow x_0$ are contained in E, we can replace $E\{x \nearrow x_0\}$ by $x \nearrow x_0$ (cf. Sec. 4.16b).


† The expression $a^x$ is often called “a raised to the xth power.”


+ Cf. Sec. 4.65, especially the concluding remark. 

+ Cf. formula (8), p. 105. 

+ For a justification of the “concavity” exhibited by the graphs of sin x and cos x, see Example 7.53c. 
+ A better (but nonstandard) term would be “polar distance.” 

t Not to be confused with the concept of the argument of a function (Sec. 2.81). 

+ This section, like Sec. 2.67, presupposes a passing acquaintance with linear algebra. 

‡ Note that the matrix of the transformation (13) has determinant

$$\begin{vmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{vmatrix} = \cos^2 \theta + \sin^2 \theta = 1.$$

As an exercise, the reader should verify that when n = 2 the definition of a rotation in $R_n$ reduces to the definition of a rotation in the plane given in Sec. 5.75a.
§ The useful symbol $\delta_{ij}$ is called the Kronecker delta.


+ Cf. formula (4), p. 56. 


‡ See e.g., G. Birkhoff and S. MacLane, A Survey of Modern Algebra (third edition), The Macmillan Co., N. Y. (1965), p. 65. The polynomials $S(z)$ and $R(z)$ can be found by formal “long division” of $P(z)$ by $Q(z)$.


+ Birkhoff and MacLane, Survey of Modern Algebra, p. 71. 


+ See e.g., G. M. Fichtenholz, The Indefinite Integral (translated by R. A. Silverman), Gordon and Breach, 
N. Y. (1972), Sec. 8, esp. p. 49ff. 


6 Series 


6.1. Numerical Series 


6.11. a. Basic definitions. Given a sequence of real numbers

$$a_1, a_2, \ldots, a_n, \ldots, \tag{1}$$

by the partial sums of (1) we mean the quantities

$$s_1 = a_1, \quad s_2 = a_1 + a_2, \quad \ldots, \quad s_n = a_1 + a_2 + \cdots + a_n, \quad \ldots$$

Suppose the sequence of partial sums

$$s_1, s_2, \ldots, s_n, \ldots \tag{2}$$

converges to a finite limit s. Then the expression

$$a_1 + a_2 + \cdots + a_n + \cdots, \tag{3}$$

called a (numerical) series,† is said to be convergent (or to converge), with sum

$$s = \lim_{n \to \infty} s_n.$$

If, however, the sequence (2) diverges, then the series (3) is said to be divergent (or to diverge), with no sum at all. The numbers (1) are called the terms of the series (3), and every finite sum

$$a_{m+1} + \cdots + a_n$$

is called a section of the series. The series (3) is called positive if the numbers (1) are all positive, and similarly with the word “positive” replaced by “negative,” “nonpositive,” or “nonnegative.” The series (3) is also written more concisely as

$$\sum_{n=1}^{\infty} a_n. \tag{3′}$$


b. Example. Consider the geometric series

$$1 + x + x^2 + \cdots + x^n + \cdots, \tag{4}$$

where x is a fixed number. The partial sums of this series are just

$$s_n = \sum_{k=0}^{n-1} x^k = \frac{1 - x^n}{1 - x} \qquad (x \neq 1),$$

by a familiar formula of high school mathematics. If $|x| < 1$, then $x^n \to 0$ as $n \to \infty$, so that

$$s = \lim_{n \to \infty} s_n = \frac{1}{1 - x}. \tag{5}$$

Thus, if $|x| < 1$, the series (4) is convergent with sum (5). If x = 1, then obviously

$$s_n = 1 + \cdots + 1 = n,$$

so that (4) diverges. For x = −1, we have

$$s_1 = 1, \quad s_2 = 0, \quad s_3 = 1, \quad s_4 = 0, \quad \ldots,$$

so that the sequence $s_1, s_2, \ldots$, although bounded, has no limit, and the series (4) is again divergent. Finally, if $|x| > 1$, the quantity $|s_n|$ increases without limit as $n \to \infty$, so that (4) diverges once again.

Summarizing these results, we see that the geometric series (4) converges to the sum (5) if $|x| < 1$ and diverges if $|x| \ge 1$.
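The behavior of the partial sums of the geometric series can be observed directly. In the sketch below the partial sums are accumulated term by term and compared with the closed-form expression; the function names are illustrative only.

```python
def partial_sum(x, n):
    """s_n = 1 + x + ... + x^(n-1), accumulated term by term."""
    total, term = 0.0, 1.0
    for _ in range(n):
        total += term
        term *= x
    return total

def closed_form(x, n):
    """(1 - x^n) / (1 - x), valid for x != 1."""
    return (1 - x ** n) / (1 - x)

for x in (0.5, -0.5, 1.0, -1.0, 2.0):
    print(x, [partial_sum(x, n) for n in (1, 2, 3, 4, 10)])
```

For $|x| < 1$ the printed values settle near $1/(1 - x)$, for x = 1 they grow like n, for x = −1 they alternate between 1 and 0, and for $|x| > 1$ they grow without bound in absolute value.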


c. Applying the Cauchy convergence criterion (Theorem 3.72b) to the sequence of partial sums (2), and bearing in mind that

$$s_n - s_m = a_{m+1} + \cdots + a_n$$

if $n > m$, we get the following Cauchy convergence criterion for series: The series (3) converges if and only if given any $\varepsilon > 0$, there exists an integer $N > 0$ such that

$$|a_{m+1} + \cdots + a_n| < \varepsilon$$

for all $n > m \ge N$.

d. In particular, if the series (3) converges, then, given any $\varepsilon > 0$, there exists an N such that

$$|a_n| < \varepsilon$$

for all $n > N$, i.e.,

$$\lim_{n \to \infty} a_n = 0.$$

Thus the sequence of terms (1) of a convergent series (3) converges to zero. Therefore $a_n \to 0$ is a necessary condition for convergence of (3). However, there are also divergent series such that $a_n \to 0$ (see Sec. 6.15b), and hence $a_n \to 0$ is not a sufficient condition for convergence of a series.


e. Another necessary condition for convergence of a series follows from 
Theorem 3.33b, which asserts that every convergent sequence is bounded. 
Therefore the sequence of partial sums (2) of a convergent series (3) must be 
bounded. This necessary condition for convergence of a series also fails to be 
sufficient. For example, the partial sums of the series $1 - 1 + 1 - 1 + \cdots$ are bounded, but the series is obviously not convergent.


6.12. In Secs. 6.13–6.17 we will study nonnegative series, i.e., series all of whose terms are nonnegative. The partial sums $s_1, s_2, \ldots$ of such a series form a nondecreasing sequence. It follows from Sec. 4.63c that if the partial sums $s_1, s_2, \ldots$ of a nonnegative series are bounded (as $n \to \infty$), then the series converges.


6.13. a. THEOREM (Comparison test). Let

$$a_1 + a_2 + \cdots + a_n + \cdots$$

be a convergent nonnegative series, and let

$$b_1 + b_2 + \cdots + b_n + \cdots \tag{6}$$

be any nonnegative series such that

$$b_n \le c a_n$$

for some constant c and all n exceeding some integer N. Then (6) is also convergent.

Proof. If

$$s = \sum_{n=1}^{\infty} a_n, \qquad \sigma_n = b_1 + \cdots + b_n,$$

then

$$\sigma_n = b_1 + \cdots + b_n \le b_1 + \cdots + b_N + c(a_{N+1} + \cdots + a_n) \le b_1 + \cdots + b_N + cs$$

for all $n > N$. Now apply Sec. 6.12. ∎

b. Example. If the terms of a nonnegative series (6) satisfy the condition

$$b_n \le c \theta^n \qquad (c > 0,\ 0 \le \theta < 1,\ n > N),$$

then (6) converges. To see this, set $a_n = \theta^n$ in the comparison test and then use Example 6.11b.


6.14. The following two tests for convergence are consequences of the comparison test:

a. THEOREM (D’Alembert’s test).† The nonnegative series

$$a_1 + a_2 + \cdots + a_n + \cdots \tag{7}$$

converges if

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} < 1 \tag{8}$$

and diverges if

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} > 1. \tag{9}$$

Proof. If (8) holds, then

$$\frac{a_{n+1}}{a_n} < \theta$$

for some $\theta < 1$ and all n starting from some integer N. Therefore

$$a_{N+1} < \theta a_N, \quad a_{N+2} < \theta a_{N+1} < \theta^2 a_N, \quad \ldots, \quad a_{N+p} < \theta^p a_N, \quad \ldots,$$

and the convergence of (7) follows from Example 6.13b. On the other hand, if (9) holds, then

$$\frac{a_{n+1}}{a_n} > 1$$

for all n starting from some integer N. Therefore

$$a_{N+1} > a_N, \quad a_{N+2} > a_{N+1} > a_N, \quad \ldots,$$

so that the terms of (7) do not approach zero. But then (7) diverges, by Sec. 6.11d.


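b. Example. The series

$$\frac{1}{1} + \frac{1 \cdot 2}{1 \cdot 3} + \frac{1 \cdot 2 \cdot 3}{1 \cdot 3 \cdot 5} + \frac{1 \cdot 2 \cdot 3 \cdot 4}{1 \cdot 3 \cdot 5 \cdot 7} + \cdots$$

converges, since

$$\frac{a_{n+1}}{a_n} = \frac{n + 1}{2n + 1} \to \frac{1}{2}.$$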
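The ratio used in D'Alembert's test can be computed exactly with rational arithmetic. The sketch below assumes that the general term of the series in this example is $a_n = (1 \cdot 2 \cdots n)/(1 \cdot 3 \cdots (2n - 1))$; the function name `term` is a hypothetical choice.

```python
from fractions import Fraction

def term(n):
    """a_n = (1 * 2 * ... * n) / (1 * 3 * ... * (2n - 1)), as an exact fraction."""
    value = Fraction(1)
    for k in range(1, n + 1):
        value *= Fraction(k, 2 * k - 1)
    return value

for n in (1, 2, 5, 50, 1000):
    ratio = term(n + 1) / term(n)
    print(n, ratio, float(ratio))
```

The printed ratios equal $(n + 1)/(2n + 1)$ exactly and approach 1/2, which is less than 1, so the test applies.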


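c. THEOREM (Cauchy’s test).‡ The nonnegative series (7) converges if

$$\lim_{n \to \infty} \sqrt[n]{a_n} < 1 \tag{10}$$

and diverges if

$$\lim_{n \to \infty} \sqrt[n]{a_n} > 1. \tag{11}$$

Proof. If (10) holds, then

$$\sqrt[n]{a_n} < \theta$$

for some $\theta < 1$ and all n starting from some integer N, so that

$$a_n < \theta^n \qquad (n \ge N),$$

and the convergence of (7) follows from Example 6.13b. On the other hand, if (11) holds, then

$$\sqrt[n]{a_n} > 1$$

for all n starting from some integer N, i.e.,

$$a_n > 1 \qquad (n \ge N),$$

so that the terms of (7) do not approach zero. But then (7) diverges, by Sec. 6.11d. ∎

d. Example. The series

$$\left( \frac{1}{1} \right)^1 + \left( \frac{2}{3} \right)^2 + \left( \frac{3}{5} \right)^3 + \left( \frac{4}{7} \right)^4 + \cdots$$

converges, since

$$\sqrt[n]{a_n} = \frac{n}{2n - 1} \to \frac{1}{2}.$$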

6.15. Next we prove another useful convergence test due to Cauchy:

THEOREM. If $a_1 \ge a_2 \ge \cdots \ge a_n \ge \cdots \ge 0$, then the series

$$\sum_{n=1}^{\infty} a_n \tag{12}$$

converges if and only if the series

$$\sum_{k=1}^{\infty} 2^k a_{2^k} \tag{13}$$

converges.†

Proof. Consider the partial sum $a_1 + \cdots + a_n$, and choose k such that $2^k > n$. Then

$$a_1 + \cdots + a_n \le a_1 + \cdots + a_{2^k - 1} = a_1 + (a_2 + a_3) + \cdots + (a_{2^{k-1}} + \cdots + a_{2^k - 1}) \le a_1 + 2a_2 + \cdots + 2^{k-1} a_{2^{k-1}}.$$

If the series (13) converges, then the expression on the right cannot exceed $a_1$ plus the sum of (13). But then the partial sums $a_1 + \cdots + a_n$ are bounded, so that (12) converges.

Conversely, choosing k such that $2^k \le n$, we have

$$a_1 + \cdots + a_n \ge a_1 + \cdots + a_{2^k} = a_1 + a_2 + (a_3 + a_4) + \cdots + (a_{2^{k-1} + 1} + \cdots + a_{2^k}) \ge \tfrac{1}{2} a_1 + a_2 + 2a_4 + \cdots + 2^{k-1} a_{2^k} = \tfrac{1}{2} (a_1 + 2a_2 + 4a_4 + \cdots + 2^k a_{2^k}).$$

Hence if (13) diverges, so does (12). ∎


Examples

a. The series

$$\sum_{n=1}^{\infty} \frac{1}{n^p} \tag{14}$$

converges if $p > 1$ and diverges if $0 \le p \le 1$. In fact, the terms of the series are nonincreasing if $p \ge 0$, so that Theorem 6.15 is applicable. Here the appropriate series (13) is

$$\sum_{k=1}^{\infty} \frac{2^k}{2^{kp}} = \sum_{k=1}^{\infty} 2^{k(1-p)} = \sum_{k=1}^{\infty} \theta^k,$$

where $\theta = 2^{1-p}$, and the result follows from Example 6.13b together with the fact that $2^{1-p} < 1$ if $p > 1$ while $2^{1-p} \ge 1$ if $p \le 1$. Note that

$$\frac{a_{n+1}}{a_n} = \frac{n^p}{(n+1)^p} = \left( \frac{n}{n+1} \right)^p \to 1$$

(as $n \to \infty$), so that D’Alembert’s test is inapplicable in this case.

b. In particular, the harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} + \cdots$$

diverges. The designation “harmonic” stems from the fact that every term of the series starting from the second is the harmonic mean of the two neighboring terms.

c. The series

$$\sum_{n=2}^{\infty} \frac{1}{n (\log_2 n)^p} \tag{15}$$

converges if $p > 1$ and diverges if $0 \le p \le 1$. In fact, since the logarithm is increasing, the terms of (15) are nonincreasing, so that Theorem 6.15 is applicable. The appropriate series (13) is now

$$\sum_{k=1}^{\infty} \frac{2^k}{2^k (\log_2 2^k)^p} = \sum_{k=1}^{\infty} \frac{1}{(\log_2 2^k)^p} = \sum_{k=1}^{\infty} \frac{1}{k^p},$$

and the convergence behavior of (15) follows at once from that of (14).
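The two inequalities in the proof of Theorem 6.15 can be checked numerically for the series (14). The sketch below compares a partial sum $a_1 + \cdots + a_n$ with the "condensed" sum $a_1 + 2a_2 + 4a_4 + \cdots + 2^k a_{2^k}$; the function names and the use of `int.bit_length` to find the relevant power of 2 are choices made for this illustration.

```python
def term(n, p=2.0):
    """a_n = 1 / n^p, the general term of the series (14)."""
    return 1.0 / n ** p

def partial_sum(n, p=2.0):
    """a_1 + ... + a_n."""
    return sum(term(j, p) for j in range(1, n + 1))

def condensed_sum(k, p=2.0):
    """a_1 + 2 a_2 + 4 a_4 + ... + 2^k a_{2^k}."""
    return sum(2 ** i * term(2 ** i, p) for i in range(k + 1))

def bounds_hold(n, p=2.0):
    """Check (1/2) * condensed <= a_1 + ... + a_n <= condensed, as in the proof."""
    k_up = n.bit_length()         # smallest k with 2^k > n
    k_low = n.bit_length() - 1    # largest k with 2^k <= n
    s = partial_sum(n, p)
    upper = condensed_sum(k_up - 1, p)       # a_1 + 2 a_2 + ... + 2^(k-1) a_{2^(k-1)}
    lower = 0.5 * condensed_sum(k_low, p)    # (1/2)(a_1 + 2 a_2 + ... + 2^k a_{2^k})
    return lower <= s <= upper

print(all(bounds_hold(n, 2.0) for n in range(1, 300)))
print(all(bounds_hold(n, 1.0) for n in range(1, 300)))
print(partial_sum(10 ** 5, 2.0), partial_sum(10 ** 5, 1.0))
```

For p = 2 the condensed sums stay bounded and so do the partial sums, while for p = 1 (the harmonic series) the condensed sums grow by 1 at each step and force the partial sums to grow without bound, though only logarithmically.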


6.16. Using the series (14) and (15) in the comparison test (Theorem 6.13a), we 
can now establish the convergence or divergence of many kinds of series for 
which D’Alembert’s test and Cauchy’s test fail to work. We begin by showing 
that the comparison test continues to work if we compare ratios of terms rather 
than the terms themselves. 


LEMMA. Given two nonnegative series

$$\sum_{n=1}^{\infty} u_n \tag{16}$$

and

$$\sum_{n=1}^{\infty} v_n, \tag{17}$$

suppose the inequality

$$\frac{u_{n+1}}{u_n} \le \frac{v_{n+1}}{v_n} \tag{18}$$

holds for all sufficiently large n. Then if (17) converges, so does (16).

Proof. Suppose (18) holds for n = N, N + 1, ... Then, multiplying the left- and right-hand sides of (18) written for n = N, N + 1, ..., N + p − 1, we get

$$\frac{u_{N+p}}{u_N} \le \frac{v_{N+p}}{v_N}$$

or

$$u_{N+p} \le \frac{u_N}{v_N} v_{N+p}. \tag{19}$$

Since (19) holds for all p = 1, 2, ..., the convergence of (16) follows from that of (17), by the ordinary form of the comparison test (Theorem 6.13a).


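6.17. a. THEOREM (Raabe’s test). Let (16) be a nonnegative series such that

$$\frac{u_{n+1}}{u_n} = 1 - \frac{\delta E}{n},$$

where $E \to 1$ as $n \to \infty$ (cf. Sec. 4.39). Then (16) converges if $\delta > 1$ and diverges if $\delta < 1$.

Proof. Choosing

$$v_n = \frac{1}{n^p},$$

we have

$$\frac{v_{n+1}}{v_n} = \left( \frac{n}{n+1} \right)^p = 1 - \frac{pE}{n},$$

by Secs. 5.59e and 5.59f. If $\delta > 1$, let $p \in (1, \delta)$. Then

$$\frac{u_{n+1}}{u_n} = 1 - \frac{\delta E}{n} \le 1 - \frac{pE}{n} = \frac{v_{n+1}}{v_n}$$

for all sufficiently large n. By Example 6.15a, the series (17) converges since $p > 1$, and hence so does (16), by the lemma. On the other hand, if $\delta < 1$, let $p \in (\delta, 1)$. Then

$$\frac{u_{n+1}}{u_n} = 1 - \frac{\delta E}{n} \ge 1 - \frac{pE}{n} = \frac{v_{n+1}}{v_n}$$

for all sufficiently large n, and if (16) converged, then so would (17), again by the lemma. But (17) diverges since $p < 1$, and hence so does (16).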


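b. Example. Consider the series

$$\sum_{n=1}^{\infty} \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{\beta(\beta+1)\cdots(\beta+n-1)} \qquad (\alpha, \beta \ne 0, -1, -2, \ldots). \tag{20}$$

In this case,

$$\frac{u_{n+1}}{u_n} = \frac{\alpha+n}{\beta+n} = \frac{1 + (\alpha/n)}{1 + (\beta/n)} = 1 - \frac{\beta-\alpha}{n} E \qquad (E \to 1 \text{ as } n \to \infty),$$

and hence, by Raabe’s test, (20) converges if $\beta - \alpha > 1$ and diverges if $\beta - \alpha < 1$.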
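The quantity governing Raabe’s test in this example can be inspected numerically. The following Python sketch is only an illustration: the function name `raabe_quantity` is ours, and exact rational arithmetic is an assumption made to avoid rounding. It evaluates $n(1 - u_{n+1}/u_n)$, which by the computation above equals $(\beta - \alpha)E$ and should therefore approach $\beta - \alpha$ as $n$ grows.

```python
from fractions import Fraction

def raabe_quantity(alpha, beta, n):
    """Return n * (1 - u_{n+1}/u_n) for the series (20).

    Uses u_{n+1}/u_n = (alpha + n) / (beta + n), as in the example.
    alpha and beta are assumed to be integers or Fractions here.
    """
    ratio = Fraction(alpha + n) / Fraction(beta + n)
    return n * (1 - ratio)

# Illustration with alpha = 1, beta = 3, so that beta - alpha = 2 > 1.
for n in (10, 1000, 10**6):
    print(n, float(raabe_quantity(1, 3, n)))
```

For these sample values the printed numbers drift toward 2, in line with the convergence of (20) when $\beta - \alpha > 1$.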


6.2. Absolute and Conditional Convergence 


6.21. We now turn to more general series, with terms that are not necessarily all of the same sign. Thus let

$$a_1 + a_2 + \cdots + a_n + \cdots \tag{1}$$

be a series whose terms are arbitrary real numbers, and consider the related nonnegative series

$$|a_1| + |a_2| + \cdots + |a_n| + \cdots \tag{2}$$

THEOREM. If the series (2) converges, then so does the series (1).

Proof. Suppose (2) converges. Then, given any $\varepsilon > 0$, there is an $N$ such that

$$|a_{m+1}| + \cdots + |a_n| < \varepsilon$$

for all $n > m \ge N$, by the Cauchy convergence criterion (Sec. 6.11c) applied to (2). But then

$$|a_{m+1} + \cdots + a_n| \le |a_{m+1}| + \cdots + |a_n| < \varepsilon$$

for all $n > m \ge N$, and hence (1) converges, by the Cauchy convergence criterion applied to (1).


6.22. Definition. A series (1) is said to be absolutely convergent if the series (2) 
converges. It may turn out that the series (1) converges while the series (2) 
diverges (an example is given in Sec. 6.24). We then say that the series (1) is 
conditionally convergent. 


6.23. THEOREM (Leibniz’s test). If

$$a_1 \ge a_2 \ge \cdots \ge a_n \ge \cdots$$

and $a_n \to 0$, then the “alternating series”

$$a_1 - a_2 + a_3 - a_4 + \cdots$$

converges.


Proof. Clearly,

$$s_{2n+1} = s_{2n-1} - a_{2n} + a_{2n+1} \le s_{2n-1},$$

$$s_{2n+2} = s_{2n} + a_{2n+1} - a_{2n+2} \ge s_{2n},$$

so that the sequence $s_2, s_4, \ldots$ is nondecreasing, while the sequence $s_1, s_3, \ldots$ is nonincreasing. Moreover,

$$s_2 \le s_4 \le \cdots \le s_{2n} \le s_{2n} + a_{2n+1} = s_{2n+1} \le s_{2k+1}$$

for any $k$ and $n \ge k$, so that the sequence $s_2, s_4, \ldots$ is bounded from above by any number $s_{2k+1}$. Therefore

$$\xi = \lim_{n \to \infty} s_{2n} \le s_{2k+1}$$

for any $k$. But then $\lim_{n \to \infty} s_{2n}$ is a lower bound for all the $s_{2k+1}$. Therefore $\lim_{n \to \infty} s_{2n+1}$ exists and

$$\xi = \lim_{n \to \infty} s_{2n} \le \lim_{n \to \infty} s_{2n+1} = \eta.$$

Finally

$$0 \le \eta - \xi \le s_{2n+1} - s_{2n} = a_{2n+1},$$

and hence, since $a_{2n+1} \to 0$,

$$\xi = \eta = \lim_{n \to \infty} s_n$$

(cf. Sec. 4.64). ∎
6.24. Examples 


a. It follows from Leibniz’s test that the series

$$1 - \frac{1}{2^\alpha} + \frac{1}{3^\alpha} - \frac{1}{4^\alpha} + \cdots$$

converges for $\alpha > 0$. For $\alpha > 1$ the series is absolutely convergent (by Example 6.15a). For $0 < \alpha \le 1$ it is only conditionally convergent, since the corresponding series of absolute values

$$1 + \frac{1}{2^\alpha} + \frac{1}{3^\alpha} + \frac{1}{4^\alpha} + \cdots$$

diverges (again by Example 6.15a).


b. In particular, the alternating series

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$$

converges. The sum of this series turns out to be $\ln 2$ (see Example 9.104a).


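c. Consider the series

$$\sum_{n=1}^{\infty} (-1)^n u_n,$$

where

$$u_n = \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{\beta(\beta+1)\cdots(\beta+n-1)} \qquad (\alpha, \beta \ne 0, -1, -2, \ldots).$$

This series is absolutely convergent for $\beta > \alpha + 1$ (by Example 6.17b). As we now show, the series is (conditionally) convergent for $\beta > \alpha$ as well. In fact, since

$$\frac{u_{n+1}}{u_n} = 1 - \frac{\beta - \alpha}{n} E$$

(as already noted), $\beta > \alpha$ implies $u_{n+1} < u_n$, at least for sufficiently large $n$. Leibniz’s test is now applicable, provided that $u_n \to 0$. But, by formula (33), p. 136,

$$\ln \frac{u_{n+1}}{u_n} = \ln\left(1 - \frac{\beta - \alpha}{n} E\right) = -\frac{\beta - \alpha}{n} E' \qquad (E' \to 1 \text{ as } n \to \infty),$$

$$\ln u_{n+1} = \ln \frac{u_{n+1}}{u_n} + \ln \frac{u_n}{u_{n-1}} + \cdots + \ln \frac{u_2}{u_1} + \ln u_1 = -(\beta - \alpha) \sum_{k=1}^{n} \frac{E'_k}{k} + \ln u_1,$$

where $E'_k$ denotes the value of $E'$ for $n = k$. Since the harmonic series diverges (Example 6.15b), it follows that $\ln u_{n+1} \to -\infty$ and hence $u_{n+1} \to 0$, as required.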
6.3. Operations on Series 


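6.31. By the sum of two numerical series

$$a_1 + a_2 + \cdots + a_n + \cdots, \tag{1}$$

$$b_1 + b_2 + \cdots + b_n + \cdots, \tag{2}$$

we mean the series

$$(a_1 + b_1) + (a_2 + b_2) + \cdots + (a_n + b_n) + \cdots. \tag{3}$$

Similarly, by the product of the series (1) with a number $\alpha$, we mean the series

$$\alpha a_1 + \alpha a_2 + \cdots + \alpha a_n + \cdots. \tag{4}$$

These formal definitions, which say nothing about the convergence of the series in question, are the only natural ones, as shown by the following

THEOREM. If the series (1) and (2) are convergent, with sums $A$ and $B$, then the series (3) and (4) are also convergent, with sums $A + B$ and $\alpha A$.

Proof. Let

$$A_n = \sum_{k=1}^{n} a_k, \qquad B_n = \sum_{k=1}^{n} b_k.$$

Then the $n$th partial sums of the series (3) and (4) are $A_n + B_n$ and $\alpha A_n$, respectively. But $A_n \to A$, $B_n \to B$, by hypothesis, and hence $A_n + B_n \to A + B$, $\alpha A_n \to \alpha A$. ∎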
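6.32. By the product of two series (1) and (2) we mean the series

$$a_1 b_1 + (a_1 b_2 + a_2 b_1) + (a_1 b_3 + a_2 b_2 + a_3 b_1) + \cdots, \tag{5}$$

where again nothing is said about convergence. The suitability of the definition (5) is shown by the following

THEOREM. If the series (1) and (2) are convergent, with sums $A$ and $B$, and if at least one of the series (1) and (2) is absolutely convergent, then the series (5) is convergent, with sum $AB$.

Proof. Let

$$A_n = a_1 + \cdots + a_n, \qquad B_n = b_1 + \cdots + b_n,$$

$$C_n = a_1 b_1 + (a_1 b_2 + a_2 b_1) + \cdots + (a_1 b_n + a_2 b_{n-1} + \cdots + a_n b_1).$$

Suppose the series (2) is absolutely convergent, so that

$$K = \sum_{k=1}^{\infty} |b_k| < \infty.$$

The numbers $A_n$ are bounded, since $A_1, A_2, \ldots$ is a convergent sequence. Let

$$M = \sup_{n \ge 1} |A_n|$$

(we can assume that $K > 0$ and $M > 0$, since otherwise the theorem is trivial). The numbers

$$|A_n - A_m| \le |A_n| + |A_m| \qquad (m, n = 1, 2, \ldots)$$

are bounded by $2M$. Given any $\varepsilon > 0$, let $N$ be such that

$$|A_n - A_m| < \frac{\varepsilon}{2K}, \qquad \sum_{k=N+1}^{\infty} |b_k| < \frac{\varepsilon}{4M}$$

for all $n > m \ge N$. Then

$$|A_n B_n - C_n| = |(a_2 + \cdots + a_n) b_n + (a_3 + \cdots + a_n) b_{n-1} + \cdots + a_n b_2|$$

$$\le |a_2 + \cdots + a_n||b_n| + |a_3 + \cdots + a_n||b_{n-1}| + \cdots + |a_n||b_2|$$

$$\le 2M(|b_{N+1}| + \cdots + |b_n|) + \frac{\varepsilon}{2K}(|b_2| + \cdots + |b_N|) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$

for all $n > 2N$. It follows that

$$\lim_{n \to \infty} C_n = \lim_{n \to \infty} A_n B_n = AB. \ ∎$$

The theorem is no longer true if neither of the series (1) and (2) is absolutely convergent (see Problem 5).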
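As a numerical illustration of the product (5), the following Python sketch forms the partial sums $C_n$ for two geometric series and compares the result with $AB$. The function name `cauchy_product_partial_sums` and the sample ratios `x`, `y` are our own choices; both sample series are absolutely convergent, so the hypotheses of the theorem hold.

```python
def cauchy_product_partial_sums(a, b):
    """Partial sums C_1, ..., C_n of the product series (5).

    a and b are finite lists holding a_1, ..., a_n and b_1, ..., b_n
    (stored 0-based, so a[0] is a_1).
    """
    n = min(len(a), len(b))
    sums, total = [], 0.0
    for k in range(n):
        # k-th group (0-based): a_1 b_{k+1} + a_2 b_k + ... + a_{k+1} b_1
        total += sum(a[i] * b[k - i] for i in range(k + 1))
        sums.append(total)
    return sums

x, y, n = 0.5, -0.3, 60
a = [x ** k for k in range(n)]   # 1, x, x^2, ... with sum A = 1/(1 - x)
b = [y ** k for k in range(n)]   # 1, y, y^2, ... with sum B = 1/(1 - y)
A, B = 1 / (1 - x), 1 / (1 - y)
C_n = cauchy_product_partial_sums(a, b)[-1]
print(C_n, A * B)
```

The two printed values agree to within rounding, as the theorem predicts for this pair of series.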


6.33. We now consider series obtained by “grouping together” terms of a given 
series. 


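THEOREM. Given a series

$$a_1 + a_2 + \cdots + a_n + \cdots, \tag{6}$$

let $m_1, m_2, \ldots$ be any increasing sequence of positive integers ($m_1 < m_2 < \cdots$), and let

$$\alpha_1 = a_1 + \cdots + a_{m_1},$$

$$\alpha_2 = a_{m_1+1} + \cdots + a_{m_2},$$

$$\ldots$$

$$\alpha_n = a_{m_{n-1}+1} + \cdots + a_{m_n},$$

$$\ldots$$

Suppose the series (6) is convergent, with sum $A$. Then the series

$$\alpha_1 + \alpha_2 + \cdots + \alpha_n + \cdots \tag{6'}$$

is also convergent, with the same sum $A$.

Proof. We need only note that

$$\alpha_1 + \alpha_2 + \cdots + \alpha_n = a_1 + a_2 + \cdots + a_{m_n},$$

where the right-hand side approaches $A$ as $n \to \infty$. ∎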


6.34. In general, convergence of the “grouped series” (6') does not imply convergence of the original series (6). For example, the series

$$1 - 1 + 1 - 1 + \cdots$$

is obviously divergent, but the grouped series

$$(1 - 1) + (1 - 1) + \cdots$$

is convergent (with sum 0). However, under certain conditions, convergence of (6') does imply that of (6), as shown by the next two theorems:


a. THEOREM. If the series (6) is nonnegative and if the series (6') is convergent, then the series (6) is also convergent.

Proof. In this case,

$$a_1 + \cdots + a_n \le a_1 + \cdots + a_n + \cdots + a_{m_n} = \alpha_1 + \cdots + \alpha_n \le \sum_{k=1}^{\infty} \alpha_k,$$

so that the partial sums of (6) are bounded. Now use the result of Sec. 6.12. ∎


b. THEOREM. If $a_n \to 0$ and if the number of elements in each group is bounded, i.e., if $m_n - m_{n-1} \le M$ for some constant $M > 0$ and all $n$, then convergence of the series (6') implies that of the series (6).

Proof. Given any $n$, let $m_k$ and $m_{k+1}$ be such that

$$m_k \le n < m_{k+1},$$

and let

$$s_n = a_1 + \cdots + a_n, \qquad \sigma_k = \alpha_1 + \cdots + \alpha_k.$$

Then clearly

$$|s_n - \sigma_k| = |a_{m_k+1} + \cdots + a_n| \le M \max_{j > m_k} |a_j|, \tag{7}$$

where the right-hand side of (7) approaches zero as $k \to \infty$, i.e., the sequences of partial sums $\sigma_k$ and $s_n$ have the same limit. Therefore if (6') converges, so does (6). ∎


6.35. Next we consider series obtained by rearranging the terms of a given 
series. 


THEOREM. Given a series

$$a_1 + a_2 + \cdots + a_n + \cdots, \tag{8}$$

let $m_1, m_2, \ldots$ be any rearrangement of the sequence of natural numbers $1, 2, \ldots$, and let

$$b_n = a_{m_n} \qquad (n = 1, 2, \ldots).$$

Suppose the series (8) is nonnegative and convergent, with sum $A$. Then the series

$$b_1 + b_2 + \cdots + b_n + \cdots \tag{8'}$$

is also convergent, with the same sum $A$.

Proof. For every partial sum

$$B_n = b_1 + \cdots + b_n \tag{9}$$

of the series (8'), there is a partial sum

$$A_{m(n)} = a_1 + \cdots + a_{m(n)} \tag{10}$$

of the series (8) which contains all the terms of (9). In turn, we can find a partial sum

$$B_{M(n)} = b_1 + \cdots + b_n + \cdots + b_{M(n)}$$

of the series (8') which contains all the terms of (10). Since $a_n \ge 0$ for all $n$, we have

$$B_n \le A_{m(n)} \le B_{M(n)}. \tag{11}$$

The first inequality in (11) implies

$$B_n \le A_{m(n)} \le A,$$

where $A$ is the sum of the convergent series (8). It follows that the series (8') is also convergent, with sum $B$, say. Obviously $m(n)$ and $M(n)$ are no smaller than $n$, and hence become arbitrarily large as $n \to \infty$. Taking the limit as $n \to \infty$ in (11), we get

$$B \le A \le B,$$

and hence $B = A$. ∎


6.36. THEOREM (Dirichlet). If the series (8) is absolutely convergent, with sum $A$, then the series (8') is also absolutely convergent, with the same sum $A$.

Proof. The absolute convergence of (8') follows from the preceding theorem. Given any $\varepsilon > 0$, let $N$ be such that

$$|A_N - A| < \varepsilon/2$$

and

$$|a_{m+1}| + \cdots + |a_n| < \varepsilon/2$$

for all $n > m \ge N$. Moreover, let $p_0$ be such that every partial sum $B_p$ of the series (8') contains the first $N$ terms of the series (8) if $p > p_0$. Then, if $p > p_0$, the difference $B_p - A_N$ contains only terms of (8) with indices greater than $N$. It follows that

$$|B_p - A_N| < \varepsilon/2$$

and hence

$$|B_p - A| \le |B_p - A_N| + |A_N - A| < \varepsilon$$

for all $p > p_0$, which implies $B_p \to A$. ∎


6.37. Things are entirely different in the case of a conditionally convergent 
series, as shown by our next 


THEOREM (Riemann). Given a conditionally convergent series (8), let $\alpha$ and $\beta$ ($\alpha \le \beta$) be any two numbers (in the extended real number system). Then there exists a rearrangement (8') of the series (8), with partial sums $B_n$, such that

$$\liminf_{n \to \infty} B_n = \alpha, \qquad \limsup_{n \to \infty} B_n = \beta. \tag{12}$$
n— 0 


Proof. The proof will be given in several steps. 


Step 1. Since the series (8) converges, its terms approach zero as $n \to \infty$. Hence $|a_n|$ can exceed any given $\varepsilon > 0$ for only a finite number of indices $n$, so that any set of terms of (8) has a term of largest absolute value. Separating positive and negative terms in (8) and choosing first the largest term (in absolute value), then the next largest, and so on, we form two series, the series

$$p_1 + p_2 + \cdots + p_n + \cdots \tag{13}$$

made up of all positive terms of (8), arranged in decreasing order, so that $p_1 \ge p_2 \ge \cdots$, and the series

$$q_1 + q_2 + \cdots + q_n + \cdots \tag{14}$$

made up of the absolute values of all the negative terms of (8), also arranged in decreasing order, so that $q_1 \ge q_2 \ge \cdots$.


Step 2. Both series (13) and (14) diverge. In fact, if (13) and (14) both converged to sums $P$ and $Q$, respectively, then the original series (8) would be absolutely convergent, contrary to hypothesis, since no partial sum $|a_1| + \cdots + |a_n|$ could exceed $P + Q$. On the other hand, suppose one of the series (13) and (14), say (14), converged, while the other, say (13), diverged. Then, since the partial sums $A_n$ of the series (8) include more and more terms of the series (13) and (14) as $n$ increases, we could choose $n$ such that the sum of the terms of (13) makes an arbitrarily large contribution to $A_n$ while the sum of the terms of (14) makes only a bounded contribution. But then $A_n \to \infty$, contrary to the assumption that (8) is (conditionally) convergent. A similar argument (with a sign change) shows that it is impossible for the series (13) to converge while the series (14) diverges. Thus both series (13) and (14) diverge, as asserted.


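Step 3. We now use the terms of the series (13) and (14) to construct a new series, according to the following rule (in the case where $\alpha$ and $\beta$ are finite): The partial sums $P_n$ of the series (13) become arbitrarily large as $n \to \infty$, and hence we can find an $n_1$ such that

$$p_1 + \cdots + p_{n_1 - 1} \le \beta < p_1 + \cdots + p_{n_1}$$

(setting $n_1 = 1$ if $p_1 > \beta$). Next we find an $m_1$ such that

$$p_1 + \cdots + p_{n_1} - q_1 - \cdots - q_{m_1} < \alpha \le p_1 + \cdots + p_{n_1} - q_1 - \cdots - q_{m_1 - 1},$$

then an $n_2 > n_1$ such that

$$p_1 + \cdots + p_{n_1} - q_1 - \cdots - q_{m_1} + p_{n_1 + 1} + \cdots + p_{n_2 - 1} \le \beta < p_1 + \cdots + p_{n_1} - q_1 - \cdots - q_{m_1} + p_{n_1 + 1} + \cdots + p_{n_2},$$

then an integer $m_2 > m_1$ such that

$$p_1 + \cdots + p_{n_1} - q_1 - \cdots - q_{m_1} + p_{n_1 + 1} + \cdots + p_{n_2} - q_{m_1 + 1} - \cdots - q_{m_2} < \alpha \le p_1 + \cdots + p_{n_1} - q_1 - \cdots - q_{m_1} + p_{n_1 + 1} + \cdots + p_{n_2} - q_{m_1 + 1} - \cdots - q_{m_2 - 1},$$

and so on indefinitely. This gives a series (8') which is some rearrangement of the original series (8).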


Step 4. The points $\alpha$ and $\beta$ are limit points of the sequence $B_n$ of partial sums of the series (8') just constructed. In fact, the partial sums of (8') with indices

$$n_1, \ n_2 + m_1, \ \ldots, \ n_k + m_{k-1}, \ \ldots$$

differ from the number $\beta$ by no more than $p_{n_1}, p_{n_2}, \ldots, p_{n_k}, \ldots$, respectively, where these quantities approach zero, while the partial sums of (8') with indices

$$n_1 + m_1, \ n_2 + m_2, \ \ldots, \ n_k + m_k, \ \ldots$$

differ from $\alpha$ by no more than $q_{m_1}, q_{m_2}, \ldots, q_{m_k}, \ldots$, respectively, where these quantities also approach zero. Moreover, there are no limit points of the sequence $B_n$ outside the interval $[\alpha, \beta]$. In fact, if $\gamma > \beta$ (say), let $\gamma - \beta = h > 0$. Then there is an index $n_k$ such that $p_{n_k} < h/2$, so that all partial sums $B_n$ with indices greater than $n_k + m_{k-1} - 1$ do not exceed

$$\beta + p_{n_k} < \beta + \frac{h}{2},$$

and hence do not get closer than $h/2$ to the point $\gamma$.


Final step. In the case of finite $\alpha$ and $\beta$, the theorem follows at once from the two assertions of Step 4 (that $\alpha$ and $\beta$ are limit points of the sequence $B_n$, and that $B_n$ has no limit points outside the interval $[\alpha, \beta]$). In the case where one or both of the numbers $\alpha$, $\beta$ is infinite, only a slight modification of the proof is required (left as an exercise for the reader).

COROLLARY. Given a conditionally convergent series, let $C$ be any number in the extended real number system. Then there exists a rearrangement (8') of the series (8) with $C$ as its sum.†

Proof. Merely choose $\alpha = \beta = C$ in Riemann’s theorem. ∎
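The construction of Step 3 is easy to imitate on a computer in the case $\alpha = \beta = C$. The Python sketch below is a simplified illustration, not the proof itself: it assumes the particular series $1 - \frac12 + \frac13 - \cdots$, whose positive terms $1, \frac13, \frac15, \ldots$ and negative terms $\frac12, \frac14, \ldots$ are already in decreasing order, and the function name `rearranged_partial_sum` is ours. Positive terms are added while the running sum does not exceed the target, and negative terms are subtracted otherwise, so that every term of the series is used exactly once.

```python
def rearranged_partial_sum(target, n_terms):
    """Partial sum of a greedy rearrangement of 1 - 1/2 + 1/3 - ...

    Mirrors Step 3 with alpha = beta = target: take unused positive
    terms while the sum is <= target, unused negative terms otherwise.
    """
    p, q = 1, 2          # next odd denominator, next even denominator
    total = 0.0
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / p     # next positive term 1/p
            p += 2
        else:
            total -= 1.0 / q     # next negative term -1/q
            q += 2
    return total

for target in (1.5, -1.0):
    print(target, rearranged_partial_sum(target, 200000))
```

After the first crossing of the target, each partial sum stays within one term of the target, and the terms tend to zero; this is the same observation as in Step 4.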


6.4. Series of Vectors 


6.41. The definitions of Sec. 6.11a for numerical series have natural analogues for the case of series whose terms are vectors of the space $R_m$. Thus, given a sequence of vectors

$$a_1, a_2, \ldots, a_n, \ldots \tag{1}$$

all belonging to the space $R_m$, by the partial sums of (1) we mean the vectors

$$s_1 = a_1 \in R_m,$$

$$s_2 = a_1 + a_2 \in R_m,$$

$$\ldots$$

$$s_n = a_1 + a_2 + \cdots + a_n \in R_m,$$

$$\ldots$$

Suppose the sequence of partial sums

$$s_1, s_2, \ldots, s_n, \ldots \tag{2}$$

converges to a vector $s \in R_m$. Then the expression

$$a_1 + a_2 + \cdots + a_n + \cdots \tag{3}$$

is said to be convergent (or to converge), with sum

$$s = \lim_{n \to \infty} s_n.$$

If, however, the sequence of vectors (2) diverges, then the series (3) is said to be divergent (or to diverge).


6.42. THEOREM. Let

$$a_1 = (a_{11}, a_{12}, \ldots, a_{1m}),$$

$$a_2 = (a_{21}, a_{22}, \ldots, a_{2m}),$$

$$\ldots$$

$$a_n = (a_{n1}, a_{n2}, \ldots, a_{nm}),$$

$$\ldots$$

Then the series (3) is convergent with sum $s = (s_1, s_2, \ldots, s_m)$ if and only if

$$a_{11} + a_{21} + \cdots + a_{n1} + \cdots = s_1,$$

$$a_{12} + a_{22} + \cdots + a_{n2} + \cdots = s_2,$$

$$\ldots$$

$$a_{1m} + a_{2m} + \cdots + a_{nm} + \cdots = s_m,$$

i.e., if and only if each “component series” of (3) is convergent, with the corresponding component of $s$ as its sum.

Proof. An immediate consequence of Theorem 3.32f. ∎


6.43. THEOREM. (Cauchy convergence criterion for vector series). The series (3) converges if and only if given any $\varepsilon > 0$, there exists an integer $N > 0$ such that‡

$$|a_{m+1} + \cdots + a_n| < \varepsilon$$

for all $n > m \ge N$.

Proof. Use Theorem 4.74, taking account of Sec. 6.11c. ∎

In particular, if the series (3) converges, then, just as in Sec. 6.11c,

$$\lim_{n \to \infty} a_n = 0.$$


6.44. THEOREM. If the numerical series

$$|a_1| + |a_2| + \cdots + |a_n| + \cdots \tag{4}$$

converges, then so does the vector series

$$a_1 + a_2 + \cdots + a_n + \cdots.$$

Proof. Identical with that of Theorem 6.21. ∎


A vector series (3) is said to be absolutely convergent if the numerical series (4) 
converges. If the series (4) diverges while the series (3) converges, we say that 
(3) is conditionally convergent. 


6.45. Theorem 6.31 on “term-by-term” addition of series and multiplication of a 
series by a number remains valid for vector series. However, Theorem 6.32 on 
multiplication of two series is in general meaningless, since multiplication of 
vectors is undefined (however, see Sec. 6.46). Theorems 6.33 and 6.34b on 
“grouping together” of terms of series and Theorem 6.36 on rearrangement of 
series continue to hold in the case of vector series. The appropriate 
generalization of Riemann’s theorem (Theorem 6.37) requires a terminology all 
its own, as developed in Problems 17-22.


6.46. Series with complex terms, being vector series in $R_2$, are included in the present scheme. However, since multiplication of complex numbers is a meaningful operation (Sec. 2.71), we now have the appropriate generalization of Theorem 6.32, besides the results of Sec. 6.45:

THEOREM. Given two series

$$a_1 + a_2 + \cdots + a_n + \cdots, \tag{5}$$

$$b_1 + b_2 + \cdots + b_n + \cdots \tag{6}$$

with complex terms, suppose (5) and (6) are convergent, with sums $A$ and $B$, while at least one of the series (5) and (6) is absolutely convergent. Then the series

$$a_1 b_1 + (a_1 b_2 + a_2 b_1) + (a_1 b_3 + a_2 b_2 + a_3 b_1) + \cdots$$

is convergent, with sum $AB$.

Proof. Identical with that of Theorem 6.32. ∎


6.47. To study nonabsolutely convergent vector series, we will use the Abel- 
Dirichlet test (Theorem 6.47c). This test is based on the following special 
transformation of finite vector sums: 


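a. THEOREM. (Abel’s transformation). Given $n$ numbers $\alpha_1, \ldots, \alpha_n$ and $n$ vectors $b_1, \ldots, b_n$, let

$$\alpha'_1 = \alpha_2 - \alpha_1, \ \alpha'_2 = \alpha_3 - \alpha_2, \ \ldots, \ \alpha'_{n-1} = \alpha_n - \alpha_{n-1},$$

$$B_1 = b_1, \ B_2 = b_1 + b_2, \ \ldots, \ B_n = b_1 + b_2 + \cdots + b_n.$$

Then

$$\sum_{k=1}^{n} \alpha_k b_k = \alpha_n B_n - \sum_{k=1}^{n-1} \alpha'_k B_k \qquad (n = 2, 3, \ldots). \tag{7}$$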


Proof. The proof is based on induction. Formula (7) is obvious for $n = 2$, since it then reduces to

$$\alpha_1 b_1 + \alpha_2 b_2 = \alpha_2 (b_1 + b_2) - (\alpha_2 - \alpha_1) b_1.$$

Suppose (7) holds for an integer $n$. Then

$$\sum_{k=1}^{n+1} \alpha_k b_k = \sum_{k=1}^{n} \alpha_k b_k + \alpha_{n+1} b_{n+1} = \alpha_n B_n - \sum_{k=1}^{n-1} \alpha'_k B_k + \alpha_{n+1} b_{n+1},$$

and to show that (7) holds for $n + 1$, we must prove that

$$\alpha_n B_n - \sum_{k=1}^{n-1} \alpha'_k B_k + \alpha_{n+1} b_{n+1} = \alpha_{n+1} B_{n+1} - \sum_{k=1}^{n} \alpha'_k B_k, \tag{8}$$

i.e., that

$$\alpha_n B_n + \alpha_{n+1} b_{n+1} = \alpha_{n+1} B_{n+1} - \alpha'_n B_n = \alpha_{n+1} B_{n+1} - (\alpha_{n+1} - \alpha_n) B_n.$$

Dropping the term $\alpha_n B_n$ from both sides and then dividing through by $\alpha_{n+1}$, we get the formula

$$b_{n+1} = B_{n+1} - B_n, \tag{9}$$

which is obviously true. To complete the proof, we need only reverse the steps leading from (8) to (9).
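Formula (7) can also be checked numerically. The following Python sketch evaluates both sides of (7) in the scalar case $m = 1$; the function name `abel_sides` and the sample data are ours, chosen only for illustration.

```python
def abel_sides(alpha, b):
    """Return (left side, right side) of formula (7).

    alpha holds alpha_1, ..., alpha_n and b holds b_1, ..., b_n
    (scalars here, i.e. the case m = 1), with n >= 2.
    """
    n = len(alpha)
    B, running = [], 0.0
    for term in b:
        running += term
        B.append(running)                 # B_k = b_1 + ... + b_k
    lhs = sum(alpha[k] * b[k] for k in range(n))
    # alpha'_k = alpha_{k+1} - alpha_k for k = 1, ..., n - 1
    rhs = alpha[-1] * B[-1] - sum(
        (alpha[k + 1] - alpha[k]) * B[k] for k in range(n - 1)
    )
    return lhs, rhs

lhs, rhs = abel_sides([3.0, 1.5, -2.0, 0.25], [1.0, -4.0, 2.5, 7.0])
print(lhs, rhs)
```

Both sides come out equal for these sample numbers. The transformation plays the same role for sums that integration by parts plays for integrals.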


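b. COROLLARY. If $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n \ge 0$ and

$$|B_k| \le C \qquad (k = 1, \ldots, n),$$

then

$$\left| \sum_{k=1}^{n} \alpha_k b_k \right| \le 2C\alpha_1. \tag{10}$$

Proof. Here

$$|\alpha_n B_n| \le C\alpha_n \le C\alpha_1, \qquad \alpha'_k \le 0,$$

and

$$\left| \sum_{k=1}^{n-1} \alpha'_k B_k \right| = \left| \sum_{k=1}^{n-1} (-\alpha'_k) B_k \right| \le C \sum_{k=1}^{n-1} (-\alpha'_k) = C[(\alpha_1 - \alpha_2) + \cdots + (\alpha_{n-1} - \alpha_n)] = C(\alpha_1 - \alpha_n) \le C\alpha_1,$$

so that (7) implies (10).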


c. THEOREM. (Abel-Dirichlet test). Given a numerical sequence $\alpha_n$ and a vector sequence $b_n \in R_m$, suppose $\alpha_n \searrow 0$ while the sequence $B_n = b_1 + \cdots + b_n$ is bounded, i.e.,

$$|B_n| \le C \qquad (n = 1, 2, \ldots).$$

Then the vector series

$$\sum_{n=1}^{\infty} \alpha_n b_n \tag{11}$$

is convergent.

Proof. Applying the estimate (10) to the “section”

$$\sum_{n=p}^{q} \alpha_n b_n$$

of the series (11), we get

$$\left| \sum_{n=p}^{q} \alpha_n b_n \right| \le 2C_{pq}\alpha_p, \tag{12}$$

where

$$C_{pq} = \sup_{p \le r \le q} |b_p + \cdots + b_r| = \sup_{p \le r \le q} |(b_1 + \cdots + b_r) - (b_1 + \cdots + b_{p-1})| = \sup_{p \le r \le q} |B_r - B_{p-1}| \le 2C.$$

But the right-hand side of (12) approaches 0 as $p \to \infty$ for any $q > p$. The convergence of (11) now follows from the Cauchy convergence criterion (Theorem 6.43).


d. Leibniz’s test (Theorem 6.23) is a special case of the Abel-Dirichlet test, obtained by setting $R_m = R_1 = R$ ($m = 1$) and

$$B_1 = 1, \ B_2 = 0, \ B_3 = 1, \ B_4 = 0, \ \ldots$$

But the Abel-Dirichlet test has a wider range of application than Leibniz’s test, even for $m = 1$. For example, consider the real series

$$\sum_{n=0}^{\infty} \alpha_n \sin n\theta, \tag{13}$$

$$\sum_{n=0}^{\infty} \alpha_n \cos n\theta \tag{14}$$

and the complex series

$$\sum_{n=0}^{\infty} \alpha_n (\cos n\theta + i \sin n\theta), \tag{15}$$

where $\alpha_n \searrow 0$. To find the values of $\theta$ for which these series converge, we argue as follows. Let

$$z = \cos \theta + i \sin \theta.$$

Then $|z| = 1$ and

$$z^n = \cos n\theta + i \sin n\theta$$

(Sec. 5.73). Moreover

$$\sum_{k=0}^{n} (\cos k\theta + i \sin k\theta) = \sum_{k=0}^{n} z^k = \frac{1 - z^{n+1}}{1 - z}$$

(cf. Example 6.11b), and hence

$$\left| \sum_{k=0}^{n} (\cos k\theta + i \sin k\theta) \right| \le \frac{2}{|1 - z|} = \frac{2}{|1 - \cos \theta - i \sin \theta|} = \frac{2}{\sqrt{(1 - \cos \theta)^2 + \sin^2 \theta}} = \sqrt{\frac{2}{1 - \cos \theta}}. \tag{16}$$

Thus the sequence

$$\sum_{k=0}^{n} (\cos k\theta + i \sin k\theta) \qquad (n = 0, 1, 2, \ldots)$$

is bounded (in absolute value), provided that $\theta \ne 0, \pm 2\pi, \pm 4\pi, \ldots$ Applying the Abel-Dirichlet test, we see that the series (15) converges if $\alpha_n \searrow 0$ and $\theta \ne 0, \pm 2\pi, \pm 4\pi, \ldots$ The series (13) and (14) converge under the same conditions, since they are just the imaginary and real parts of the series (15), respectively. Note that the series (13) also converges if $\theta = 0, \pm 2\pi, \pm 4\pi, \ldots$, since then its terms all vanish.


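e. The estimate (16) implies two analogous estimates “in the real domain”:

$$\left| \sum_{k=0}^{n} \cos k\theta \right| \le \sqrt{\frac{2}{1 - \cos \theta}},$$

$$\left| \sum_{k=0}^{n} \sin k\theta \right| \le \sqrt{\frac{2}{1 - \cos \theta}}.$$

These estimates often figure in problems involving the Abel-Dirichlet test.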
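The bound (16) can be tested on a computer for particular values of $\theta$. The Python sketch below is illustrative only; the function names `max_partial_sum_modulus` and `bound_16` and the sample angles are ours. It tracks the largest modulus of the partial sums of $\sum (\cos k\theta + i \sin k\theta)$ and compares it with the right-hand side of (16).

```python
import math

def max_partial_sum_modulus(theta, n_max):
    """Largest |sum_{k=0}^{n} (cos k*theta + i sin k*theta)|, n = 0..n_max."""
    z = complex(math.cos(theta), math.sin(theta))
    total, power, worst = 0j, 1 + 0j, 0.0
    for _ in range(n_max + 1):
        total += power            # add z^k
        worst = max(worst, abs(total))
        power *= z                # advance to z^(k+1)
    return worst

def bound_16(theta):
    """Right-hand side of (16); requires theta != 0, +-2*pi, ..."""
    return math.sqrt(2.0 / (1.0 - math.cos(theta)))

for theta in (0.1, 1.0, 3.0):
    print(theta, max_partial_sum_modulus(theta, 2000), bound_16(theta))
```

The observed maximum stays below the bound for each sample angle, and the bound grows without limit as $\theta$ approaches a multiple of $2\pi$.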
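f. For the special case where

$$\alpha_n = r^n \qquad (0 < r < 1),$$

it is easy to write explicit expressions for the sums of the series (13)-(15). In fact, summing the appropriate geometric series, we get

$$\sum_{n=0}^{\infty} r^n e^{in\theta} = \frac{1}{1 - re^{i\theta}}.$$

Taking real and imaginary parts of the right-hand side, we get

$$\frac{1}{1 - re^{i\theta}} = \frac{1 - re^{-i\theta}}{(1 - re^{i\theta})(1 - re^{-i\theta})} = \frac{1 - r\cos \theta + ir \sin \theta}{1 - 2r \cos \theta + r^2}.$$

It follows that

$$\sum_{n=0}^{\infty} r^n \cos n\theta = \operatorname{Re} \sum_{n=0}^{\infty} r^n e^{in\theta} = \frac{1 - r \cos \theta}{1 - 2r \cos \theta + r^2},$$

$$\sum_{n=0}^{\infty} r^n \sin n\theta = \operatorname{Im} \sum_{n=0}^{\infty} r^n e^{in\theta} = \frac{r \sin \theta}{1 - 2r \cos \theta + r^2}.$$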
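These closed forms for $\sum r^n \cos n\theta$ and $\sum r^n \sin n\theta$ lend themselves to a quick numerical check. In the Python sketch below, the function names `partial_sums` and `closed_forms` and the sample values $r = 0.5$, $\theta = 1.2$ are ours; the partial sums are truncated after 200 terms, which for $r = 0.5$ leaves a negligible remainder.

```python
import math

def partial_sums(r, theta, n_terms):
    """Partial sums of sum r^n cos(n theta) and sum r^n sin(n theta), n >= 0."""
    c = sum(r ** n * math.cos(n * theta) for n in range(n_terms))
    s = sum(r ** n * math.sin(n * theta) for n in range(n_terms))
    return c, s

def closed_forms(r, theta):
    """Real and imaginary parts of 1 / (1 - r e^{i theta})."""
    d = 1.0 - 2.0 * r * math.cos(theta) + r * r
    return (1.0 - r * math.cos(theta)) / d, r * math.sin(theta) / d

print(partial_sums(0.5, 1.2, 200))
print(closed_forms(0.5, 1.2))
```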


6.48. Two-sided series

a. Let

$$\ldots, a_{-n}, \ldots, a_{-1}, a_0, a_1, \ldots, a_n, \ldots$$

be a “two-sidedly infinite” sequence of vectors in the space $R_m$. Then by the sum of the “two-sided series”

$$\sum_{k=-\infty}^{\infty} a_k \tag{17}$$

we mean the limit (provided it exists) of the “two-sided partial sums”

$$s_{\nu n} = \sum_{k=-\nu}^{n} a_k \qquad (\nu, n = 0, 1, 2, \ldots),$$

where $\nu$ and $n$ approach infinity independently. More exactly, the series (17) is said to be convergent (or to converge), with sum $s$, if, given any $\varepsilon > 0$, there exists an integer $N > 0$ such that

$$\left| s - \sum_{k=-\nu}^{n} a_k \right| < \varepsilon \tag{18}$$

for all $\nu, n > N$.


b. The following theorem reduces the problem of the convergence of a two-sided series to that of the convergence of two one-sided series:

THEOREM. The two-sided series (17) converges if and only if both “one-sided series”

$$\sum_{k=0}^{\infty} a_k, \qquad \sum_{k=1}^{\infty} a_{-k} \tag{19}$$

converge. Moreover, if the series (19) have the sums $A$ and $B$, respectively, then the series (17) has the sum $A + B$.


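Proof. Suppose the series (19) converge, with sums $A$ and $B$. Then, given any $\varepsilon > 0$, there is an $N$ such that

$$\left| A - \sum_{k=0}^{n} a_k \right| < \frac{\varepsilon}{2}, \qquad \left| B - \sum_{k=1}^{\nu} a_{-k} \right| < \frac{\varepsilon}{2}$$

for all $\nu, n > N$. But then

$$\left| (A + B) - \sum_{k=-\nu}^{n} a_k \right| = \left| \left( A - \sum_{k=0}^{n} a_k \right) + \left( B - \sum_{k=1}^{\nu} a_{-k} \right) \right| < \varepsilon,$$

and hence (17) converges, with sum $A + B$.

Conversely, if the series (17) converges, then, by (18), there is an $N$ such that

$$\left| \sum_{k=p+1}^{q} a_k \right| = \left| \left( s - \sum_{k=-\nu}^{p} a_k \right) - \left( s - \sum_{k=-\nu}^{q} a_k \right) \right| < 2\varepsilon$$

for all $q > p > N$ ($\nu$ sufficiently large). Therefore the first of the series (19) converges, by the Cauchy convergence criterion (Theorem 6.43). A similar argument shows that the second of the series (19) also converges. ∎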


c. Example. The two-sided series

$$\sum_{k=-\infty}^{\infty} c_k e^{ikt}$$

converges for all real $t$ if

$$\sum_{k=-\infty}^{\infty} |c_k| < \infty.$$

The series converges everywhere except possibly at the points $t = 0, \pm 2\pi, \pm 4\pi, \ldots$ if $c_n \searrow 0$, $c_{-n} \searrow 0$ as $n \to \infty$ (cf. Sec. 6.47d).


6.49. Symmetric summation of two-sided series 


a. The two-sided vector series (17) is said to be symmetrically summable if its “symmetric partial sums”

$$s_n = \sum_{k=-n}^{n} a_k \qquad (n = 0, 1, 2, \ldots) \tag{20}$$

approach a limit $s$ as $n \to \infty$. The limit $s$ (if it exists) is called the symmetric sum of the series (17).

If the series (17) converges in the sense of Sec. 6.48a, then it is obviously symmetrically summable, and its symmetric sum coincides with its ordinary sum. But not every symmetrically summable series converges in the sense of Sec. 6.48a. For example, the two-sided series with terms

$$a_k = \begin{cases} 1 & \text{if } k > 0, \\ 0 & \text{if } k = 0, \\ -1 & \text{if } k < 0, \end{cases}$$

is symmetrically summable, with symmetric sum 0, but does not converge in the sense of Sec. 6.48a.
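The difference between the two notions is visible already in the example just given. The Python sketch below (the helper names `a` and `two_sided_partial_sum` are ours) computes the two-sided partial sums $\sum_{k=-\nu}^{n} a_k$ for the terms $a_k = 1, 0, -1$ according to the sign of $k$: along $\nu = n$ they vanish, while along $n = 2\nu$ they grow without bound.

```python
def a(k):
    """Terms of the example: 1 for k > 0, 0 for k = 0, -1 for k < 0."""
    return (k > 0) - (k < 0)

def two_sided_partial_sum(nu, n):
    """Sum of a_k for k = -nu, ..., n."""
    return sum(a(k) for k in range(-nu, n + 1))

for n in (1, 10, 100):
    symmetric = two_sided_partial_sum(n, n)        # symmetric partial sum (20)
    lopsided = two_sided_partial_sum(n, 2 * n)     # nu and n tending to infinity differently
    print(n, symmetric, lopsided)
```

Since the partial sums depend on how $\nu$ and $n$ tend to infinity, no limit in the sense of Sec. 6.48a exists, although the symmetric partial sums are all equal to 0.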


b. Symmetric summation of two-sided series also reduces to ordinary “one-sided convergence”:

THEOREM. The two-sided series (17) is symmetrically summable if and only if the one-sided series

$$a_0 + \sum_{k=1}^{\infty} (a_k + a_{-k}) \tag{21}$$

converges. Moreover, if (21) converges, then its sum is the symmetric sum of (17).

Proof. An immediate consequence of the obvious equality of the $n$th partial sum of the series (21) and the $n$th symmetric partial sum (20) of the series (17). ∎


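c. Symmetric summation is often used to study trigonometric series of the form

$$\sum_{k=-\infty}^{\infty} c_k e^{ikt}.$$

For example, the series

$$\cdots - \frac{e^{-2it}}{2} - \frac{e^{-it}}{1} + \frac{e^{it}}{1} + \frac{e^{2it}}{2} + \cdots \tag{22}$$

converges for all $t \ne 0, \pm 2\pi, \pm 4\pi, \ldots$ (cf. Example 6.48c), but not for $t = 0, \pm 2\pi, \pm 4\pi, \ldots$ because of Theorem 6.48b, since then the first of the series (19), say, reduces to the divergent harmonic series. However, the series (22) is symmetrically summable for all real $t$, since the series

$$\sum_{k=1}^{\infty} \left( \frac{e^{ikt}}{k} - \frac{e^{-ikt}}{k} \right) = 2i \sum_{k=1}^{\infty} \frac{\sin kt}{k}$$

converges everywhere (see Sec. 6.47d).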


6.5. Series of Functions 


6.51. Series of functions have already been encountered in Secs. 6.47 and 6.49. 
We now discuss such series officially. 


Let

$$a_1(x) + a_2(x) + \cdots + a_n(x) + \cdots \tag{1}$$

be a series of functions, all defined on some set $E$ and taking values in the space $R_m$. In keeping with Sec. 5.91, we say that the series (1) is convergent on $E$ if the sequence of partial sums

$$s_1(x) = a_1(x),$$

$$s_2(x) = a_1(x) + a_2(x),$$

$$\ldots \tag{2}$$

$$s_n(x) = a_1(x) + a_2(x) + \cdots + a_n(x),$$

$$\ldots$$

of (1) is convergent on $E$. The limit

$$s(x) = \lim_{n \to \infty} s_n(x)$$

is then called the sum of the series (1). By the same token, we say that the series (1) is uniformly convergent on $E$ if the sequence of partial sums (2) is uniformly convergent on $E$ (see Sec. 5.93).


6.52. For series of functions we have the following version of the Cauchy 
convergence criterion: 


THEOREM. The series (1) is uniformly convergent on $E$ if and only if given any $\varepsilon > 0$, there exists an integer $N > 0$ such that

$$\sup_{x \in E} |a_{m+1}(x) + \cdots + a_n(x)| < \varepsilon$$

for all $n > m \ge N$.

Proof. An immediate consequence of Theorem 5.96, with $M$ taken to be the space $R_m$, with the usual norm $|\cdots|$.


6.53. The next theorem gives a simple sufficient condition for uniform 
convergence of the series (1): 


THEOREM. (Weierstrass’ test). If

$$\sup_{x \in E} |a_n(x)| = \alpha_n$$

and if the numerical series

$$\sum_{n=1}^{\infty} \alpha_n$$

converges, then the series (1) is uniformly convergent on $E$.

Proof. An immediate consequence of the preceding theorem, together with the estimate

$$\sup_{x \in E} |a_{m+1}(x) + \cdots + a_n(x)| \le \alpha_{m+1} + \cdots + \alpha_n$$

and the Cauchy convergence criterion for a numerical series (Sec. 6.11c). ∎
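The estimate behind the proof can be observed on a sample series. The Python sketch below assumes, purely for illustration, the terms $a_k(x) = \sin(kx)/k^2$ on $E = [0, 2\pi]$, for which the majorant $\alpha_k = 1/k^2$ is admissible; the function names `block_sup` and `block_bound` are ours. The maximum is taken over a finite grid of sample points, so it only approximates the supremum over $E$ from below.

```python
import math

def block_sup(m, n, grid):
    """Max over the grid of |a_{m+1}(x) + ... + a_n(x)|, a_k(x) = sin(kx)/k^2."""
    return max(
        abs(sum(math.sin(k * x) / k ** 2 for k in range(m + 1, n + 1)))
        for x in grid
    )

def block_bound(m, n):
    """Majorant alpha_{m+1} + ... + alpha_n with alpha_k = 1/k^2."""
    return sum(1.0 / k ** 2 for k in range(m + 1, n + 1))

grid = [0.01 * i for i in range(629)]   # sample points covering [0, 2*pi]
print(block_sup(10, 60, grid), block_bound(10, 60))
```

The block of terms is dominated by the corresponding block of the numerical series at every sample point, and the latter block is small for large $m$ regardless of $x$.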


6.54. a. THEOREM. If the series (1) converges uniformly on $E$ and if every term $a_n(x)$ is bounded on $E$, then the sum $s(x)$ of the series (1) is also bounded on $E$.

Proof. An immediate consequence of Theorem 5.94. ∎

b. THEOREM. If the series (1) converges uniformly on $E$, where $E$ is a metric space, and if every term $a_n(x)$ is continuous on $E$, then the sum $s(x)$ of the series is also continuous on $E$.

Proof. An immediate consequence of Theorem 5.95a and its corollary. ∎


6.6. Power Series 
6.61. By a power series we mean a series of the form

$$a_0 + a_1(z - z_0) + a_2(z - z_0)^2 + \cdots + a_n(z - z_0)^n + \cdots, \tag{1}$$

where $z = x + iy$, $z_0 = x_0 + iy_0$ and the coefficients $a_0, a_1, \ldots, a_n, \ldots$ are complex numbers.† By the region of convergence of the power series (1) is meant the set of all values of $z$ for which (1) converges.


6.62. The following theorem plays a key role in determining the region of 
convergence of a given power series: 


THEOREM. (Cauchy-Hadamard). Let

$$\frac{1}{\rho} = \limsup_{n \to \infty} \sqrt[n]{|a_n|},$$

where the upper limit is taken in the extended real number system. Then the series (1) converges (in fact absolutely) for all $z$ such that $|z - z_0| < \rho$ and diverges for all $z$ such that $|z - z_0| > \rho$.

Proof. Suppose $\rho$ is finite and nonzero, and let

$$|z - z_0| < \rho = \frac{1}{\limsup \sqrt[n]{|a_n|}}.$$

Then

$$|z - z_0| \limsup \sqrt[n]{|a_n|} = \limsup \sqrt[n]{|a_n||z - z_0|^n} < 1,$$

so that the series (1) is absolutely convergent by Cauchy’s test (Theorem 6.14c) and Theorem 6.44. On the other hand, if

$$|z - z_0| > \rho = \frac{1}{\limsup \sqrt[n]{|a_n|}},$$

then

$$|z - z_0| \limsup \sqrt[n]{|a_n|} = \limsup \sqrt[n]{|a_n||z - z_0|^n} > 1,$$

so that the series (1) diverges since its terms cannot approach zero as $n \to \infty$ (cf. Sec. 6.43). The proof for the cases $\rho = 0$ and $\rho = \infty$ is left as an exercise for the reader. ∎


6.63. Theorem 6.62 shows that the region of convergence $G$ of the power series (1) is a disk of radius $\rho$. Hence $\rho$ is called the radius of convergence of the series (1). More exactly, $G$ contains every point inside the “circle of convergence” $\Gamma = \{z : |z - z_0| = \rho\}$ and no points outside $\Gamma$. The theorem says nothing about the convergence of the series (1) on the boundary of the disk, i.e., on $\Gamma$ itself, and in fact various possibilities can occur, requiring a special investigation in each case.


6.64. Examples 


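a. The series

$$\sum_{n=1}^{\infty} \frac{z^n}{n^n}$$

converges for all $z$, since

$$\frac{1}{\rho} = \lim_{n \to \infty} \sqrt[n]{\frac{1}{n^n}} = \lim_{n \to \infty} \frac{1}{n} = 0.$$

b. The series

$$\sum_{n=1}^{\infty} n^n z^n$$

converges only for $z = 0$, since

$$\frac{1}{\rho} = \lim_{n \to \infty} \sqrt[n]{n^n} = \lim_{n \to \infty} n = \infty.$$

c. The series

$$\sum_{n=1}^{\infty} n^\alpha z^n,$$

where $\alpha$ is a fixed real number, has radius of convergence 1, since, by formula (26), p. 154,

$$\frac{1}{\rho} = \lim_{n \to \infty} \sqrt[n]{n^\alpha} = \lim_{n \to \infty} \left( \sqrt[n]{n} \right)^\alpha = 1.$$

However, the series does not converge at any point of the circle $|z| = 1$ if $\alpha \ge 0$, since its terms then fail to converge to zero. If $\alpha < -1$, the series is absolutely convergent at every point of the circle $|z| = 1$, by Example 6.15a, while if $-1 \le \alpha < 0$, the series diverges for $z = 1$ but converges for all other $z \ne 1$, $|z| = 1$, by Sec. 6.47d.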
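The quantity $\sqrt[n]{|a_n|}$ appearing in the Cauchy-Hadamard theorem can be tabulated for coefficient sequences of the three kinds met in these examples, namely $n^{-n}$, $n^n$ and $n^\alpha$. The Python sketch below is only a numerical aid: the function name `root_test_value` is ours, the coefficients are passed through their logarithms (an assumption made to avoid overflow), and a single value of $n$ merely suggests the upper limit rather than computing it.

```python
import math

def root_test_value(log_abs_coeff, n):
    """Return |a_n| ** (1/n), given a function computing log|a_n|."""
    return math.exp(log_abs_coeff(n) / n)

samples = {
    "a_n = n^(-n)": lambda n: -n * math.log(n),   # values tend to 0
    "a_n = n^n": lambda n: n * math.log(n),       # values tend to infinity
    "a_n = n^3": lambda n: 3 * math.log(n),       # values tend to 1
}
for name, log_coeff in samples.items():
    print(name, [root_test_value(log_coeff, n) for n in (10, 1000, 10**6)])
```

The three rows behave like $1/n$, $n$ and $(\sqrt[n]{n})^3$, respectively, in agreement with radii of convergence $\infty$, $0$ and $1$.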


6.65. Suppose the power series (1) has radius of convergence $\rho$. Then, in general, the convergence on the disk $|z - z_0| < \rho$ fails to be uniform. For example, if the geometric series

$$1 + z + z^2 + \cdots$$

converged uniformly on the disk $|z| < 1$, then its sum would be bounded on $|z| < 1$, by Theorem 6.54a. But the sum of the geometric series is the function

$$\frac{1}{1 - z}$$

(cf. Example 6.11b), which is obviously unbounded on the disk $|z| < 1$.

Nevertheless we have the following

a. THEOREM. The series (1) is uniformly convergent on every disk $|z - z_0| \le \rho_1$, where $\rho_1 < \rho$.

Proof. For $|z - z_0| \le \rho_1$, we have the estimate

$$|a_n (z - z_0)^n| \le |a_n| \rho_1^n.$$

But the series with terms $|a_n| \rho_1^n$ converges, by the Cauchy-Hadamard theorem (choose $z = z_0 + \rho_1$). Therefore the series (1) is uniformly convergent on the disk $|z - z_0| \le \rho_1$, by Weierstrass’ test (Theorem 6.53). ∎

b. COROLLARY. The sum $s(z)$ of the series (1) is bounded and continuous on every disk $|z - z_0| \le \rho_1 < \rho$.

Proof. An immediate consequence of the preceding theorem and the theorems of Sec. 6.54. ∎

It follows that $s(z)$ is continuous on the whole disk $|z - z_0| < \rho$, since every point of this disk lies in some smaller disk $|z - z_0| \le \rho_1$. However, $s(z)$ need not be bounded on the whole disk $|z - z_0| < \rho$, as we have just seen.


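6.66. As we know, a power series with radius of convergence $\rho$ may or may not converge at points on the boundary of its region of convergence, i.e., at points of the circle $|z - z_0| = \rho$. However, if the series converges at a boundary point $z_1$, then it converges uniformly on the whole segment going from the center of the circle to the point $z_1$. To see this, we need only consider the case $z_0 = 0$, $z_1 = t_1 > 0$ (why?):

THEOREM. If the power series

$$\sum_{n=0}^{\infty} a_n t^n$$

converges at the point $t_1 > 0$, then it converges uniformly on the whole interval $0 \le t \le t_1$.

Proof. The theorem will be proved if, given any $\varepsilon > 0$, we succeed in finding an $N$ such that

$$\left| \sum_{k=p+1}^{q} a_k t^k \right| < \varepsilon$$

for all $q > p \ge N$ and all $0 \le t \le t_1$. Applying Abel’s transformation (Theorem 6.47a), with

$$\alpha_k = \left( \frac{t}{t_1} \right)^k, \qquad b_k = a_k t_1^k \qquad (k = p + 1, \ldots, q),$$

where for the time being $p$ and $q$ ($p < q$) are arbitrary, we get

$$\sum_{k=p+1}^{q} a_k t^k = \left( \frac{t}{t_1} \right)^q \sum_{k=p+1}^{q} a_k t_1^k - \sum_{n=p+1}^{q-1} \left[ \left( \frac{t}{t_1} \right)^{n+1} - \left( \frac{t}{t_1} \right)^n \right] \sum_{k=p+1}^{n} a_k t_1^k.$$

Let $N$ be such that

$$\left| \sum_{k=p+1}^{n} a_k t_1^k \right| < \frac{\varepsilon}{2}$$

for all $n > p \ge N$. Then

$$\left| \sum_{k=p+1}^{q} a_k t^k \right| \le \left( \frac{t}{t_1} \right)^q \left| \sum_{k=p+1}^{q} a_k t_1^k \right| + \sum_{n=p+1}^{q-1} \left| \left( \frac{t}{t_1} \right)^{n+1} - \left( \frac{t}{t_1} \right)^n \right| \left| \sum_{k=p+1}^{n} a_k t_1^k \right|$$

$$< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \max_{0 \le t \le t_1} \left[ \left( \frac{t}{t_1} \right)^{p+1} - \left( \frac{t}{t_1} \right)^q \right] \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$

for all $q > p \ge N$. ∎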


6.67. Examples 


a. The series

$$1 + \frac{z}{1} + \frac{z^2}{2} + \cdots + \frac{z^n}{n} + \cdots$$

converges for all $z \ne 1$, $|z| = 1$, by Sec. 6.47d. By Theorem 6.66, the series converges uniformly on every radius of the disk $|z| \le 1$ going to a point $z_1$ such that $|z_1| = 1$, $z_1 \ne 1$. However, it must not be thought that the series converges uniformly on the set of points belonging to all these radii!

b. The series

$$\sum_{n=1}^{\infty} \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{\beta(\beta+1)\cdots(\beta+n-1)} z^n \qquad (\alpha, \beta \ne 0, -1, -2, \ldots)$$

converges for $z = 1$ if $\beta > \alpha + 1$, by Example 6.17b. Hence, if $\beta > \alpha + 1$, the series converges for $|z| \le 1$ and converges uniformly on the interval $0 \le x \le 1$. If $\beta > \alpha$, the series converges for $z = -1$, by Example 6.24c, and hence converges uniformly on the interval $-1 \le x \le 0$.


6.68. Theorem 6.66 has the following important implication: 
THEOREM. (Abel). Let $f(x)$ be a function continuous on the closed interval $0 \le x \le x_1$ such that

$$f(x) = \sum_{n=0}^{\infty} a_n x^n$$

on the half-open interval $0 \le x < x_1$. Suppose further that the series

$$\sum_{n=0}^{\infty} a_n x_1^n \tag{2}$$

converges. Then

$$f(x_1) = \sum_{n=0}^{\infty} a_n x_1^n.$$

Proof. Let

$$\sum_{n=0}^{\infty} a_n x^n = s(x) \tag{3}$$

for all $x \in [0, x_1]$. Since (2) converges, the series (3) is uniformly convergent on $[0, x_1]$, by Theorem 6.66. Hence $s(x)$ is continuous on $[0, x_1]$, by Theorem 6.54b. But $f(x)$ is also continuous on $[0, x_1]$, by hypothesis, and moreover $f(x) = s(x)$ for $0 \le x < x_1$. It follows that

$$f(x_1) = \lim_{x \nearrow x_1} f(x) = \lim_{x \nearrow x_1} s(x) = s(x_1) = \sum_{n=0}^{\infty} a_n x_1^n. \ ∎$$
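A standard illustration of Abel’s theorem is the expansion $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$, valid for $0 \le x < 1$; this expansion is not derived in the present section and is taken here as known. At $x_1 = 1$ the series converges by Leibniz’s test, so the theorem gives the sum $\ln 2$, in agreement with Example 6.24b. The Python sketch below (the function name `log_series` is ours) compares partial sums with $\ln(1 + x)$ both inside the interval and at the endpoint.

```python
import math

def log_series(x, n_terms):
    """Partial sum of x - x^2/2 + x^3/3 - ... with n_terms terms."""
    return sum((-1) ** (n + 1) * x ** n / n for n in range(1, n_terms + 1))

print(log_series(0.9, 400), math.log(1.9))      # inside the interval
print(log_series(1.0, 100000), math.log(2.0))   # at the endpoint x_1 = 1
```

Convergence at the endpoint is much slower than inside the interval, which is consistent with the fact that only conditional convergence is available there.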


xx, xXx, 


Problems 


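1. Prove that the series

$$\sum_{n=1}^{\infty} a_n$$

diverges if

$$a_n > 0, \qquad \frac{a_{n+1}}{a_n} = 1 - \frac{1}{n} - \frac{c_n}{n^2}, \qquad |c_n| < C \qquad (n = 1, 2, \ldots).$$

(Gauss)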
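2. Prove that the hypergeometric series

\[ F(\alpha, \beta, \gamma; x) = 1 + \sum_{n=1}^{\infty} \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)\,\beta(\beta+1)\cdots(\beta+n-1)}{n!\,\gamma(\gamma+1)\cdots(\gamma+n-1)} x^n \]

($\alpha, \beta, \gamma \ne 0, -1, -2, \ldots$) is absolutely convergent if $|x| < 1$ and divergent if $|x| > 1$.
Prove that for $x = 1$ the series is (absolutely) convergent if $\gamma > \alpha + \beta$ and
divergent if $\gamma \le \alpha + \beta$, while for $x = -1$ the series is absolutely convergent if
$\gamma > \alpha + \beta$, conditionally convergent if $\alpha + \beta - 1 < \gamma \le \alpha + \beta$, and divergent if
$\gamma \le \alpha + \beta - 1$.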


3. Prove that if $a_n \ge a_{n+1} > 0$ ($n = 1, 2, \ldots$) and if the series

\[ \sum_{n=1}^{\infty} a_n \]

is convergent, then $a_n$ approaches zero faster than $1/n$, i.e.,

\[ \lim_{n \to \infty} n a_n = 0. \]

Comment. The function $1/n$ cannot be replaced by any function $g(n)$
approaching zero faster than $1/n$. (A. S. Nemirovski)
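4. Suppose the series

\[ \sum_{n=1}^{\infty} a_n \qquad (a_n > 0) \]

converges. Prove that there exists a sequence $b_n$ satisfying the conditions

\[ b_1 \le b_2 \le \cdots \le b_n \le \cdots, \qquad \lim_{n \to \infty} b_n = \infty \]

such that the series

\[ \sum_{n=1}^{\infty} a_n b_n \]

also converges.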
5. Prove that multiplying the convergent series

\[ 1 - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \cdots \]

by itself gives a divergent series.
6. Prove that for the series $a_1 - a_2 + a_3 - a_4 + \cdots$ figuring in Leibniz's test
(Theorem 6.23), the difference between the sum of the entire series and the sum
of the first $n$ terms is less than the $(n + 1)$st term in absolute value if
$a_1 > a_2 > \cdots > a_n > \cdots$.


l 


ay 


7. Prove that the radius of convergence of a power series

\[ \sum_{n=0}^{\infty} a_n (z - z_0)^n \]

for which

\[ \lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = r \]

equals $1/r$.
8. Prove that if a, 7 0 (n=1, 2, ...) and if 


fit)=Yat"<C  (0<t<1), 


9. Given numbers a,, 2 0 and 5, such that 


a, = b(n, k= 1, 2,...), 


(n= L 2yeve)s 


& 
3 
ro 
\| 
w& 
- 
Pint 
— 
II 
——- 
7) 
w 
~—” 
3 


Ms iMs 


~ 
il 
bo 
=) 
= 
Il 
“ 


Prove that 


s= lim 5. 

k—-o0 
10. (Infinite products). Given a sequence of complex numbers $z_1, z_2, \ldots, z_n, \ldots$,
we say that the “infinite product”

\[ \prod_{k=1}^{\infty} z_k = z_1 z_2 \cdots z_k \cdots \tag{1} \]

converges if the sequence of “partial products”

\[ P_n = \prod_{k=1}^{n} z_k = z_1 z_2 \cdots z_n \qquad (n = 1, 2, \ldots) \]

converges to a finite limit $P \ne 0$. The number $P$ is then called the value of the
product (1).
Prove that if (1) converges, then

\[ \lim_{n \to \infty} z_n = 1. \]
11. (Continuation). Given that $x_1 > 0, x_2 > 0, \ldots, x_n > 0, \ldots$, prove that the infinite
product

\[ \prod_{k=1}^{\infty} x_k \]

converges if and only if the series

\[ \sum_{k=1}^{\infty} \log_b x_k \]

converges (for an arbitrary base $b > 1$).
12. (Continuation). Let $x_k = 1 + w_k$, where all the $w_k$ ($k = 1, 2, \ldots$) have the same
sign. Prove that the infinite product

\[ \prod_{k=1}^{\infty} (1 + w_k) \]

converges if and only if the series

\[ \sum_{k=1}^{\infty} w_k \]

converges.

Comment. The stipulation that all the $w_k$ have the same sign cannot be
dropped (see Chapter 8, Problem 15).
13. Prove that

\[ (1 + x)(1 + x^2)(1 + x^4)(1 + x^8) \cdots (1 + x^{2^n}) \cdots = \frac{1}{1 - x} \]

if $|x| < 1$.
14. Prove that

\[ \prod_{k=1}^{\infty} \frac{1}{1 - p_k^{-x}} = \sum_{n=1}^{\infty} \frac{1}{n^x}, \]

where $x > 1$ and $p_1, p_2, \ldots, p_n, \ldots$ is the sequence of all prime numbers greater than
1 arranged in increasing order. (Euler)
15. Prove that the series

\[ \sum_{n=1}^{\infty} \frac{1}{p_n} \qquad (p_n \text{ prime}) \]

diverges.


16. Let $q_1, q_2, \ldots, q_n, \ldots$ be the sequence of all natural numbers whose decimal
representations (Sec. 1.77) contain no nines at all. Does the series

\[ \sum_{n=1}^{\infty} \frac{1}{q_n} \]

converge?
17. Let

\[ u_1 + u_2 + \cdots + u_n + \cdots \qquad (u_n \in R_m) \tag{2} \]

be a series of vectors in the Euclidean space $R_m$. A vector $p \in R_m$ is called a
vector of absolute convergence of the series (2) if the numerical series

\[ (p, u_1) + (p, u_2) + \cdots + (p, u_n) + \cdots, \]

where $(p, u_n)$ denotes the scalar product of the vectors $p$ and $u_n$, is absolutely
convergent. Prove that the series (2) is absolutely convergent if and only if every
vector $p \in R_m$ is a vector of absolute convergence of (2).

18. (Continuation). A vector $q \in R_m$ is called a vector of absolute divergence of
the series (2) if the terms of (2) lying in every “solid angle” containing $q$ form an
absolutely divergent series. Prove that the series (2) is absolutely divergent if
and only if it has at least one vector of absolute divergence.

19. (Continuation). Let $j_1, \ldots, j_n, \ldots$ be an increasing sequence of natural
numbers, and let $k_1, \ldots, k_n, \ldots$ be the remaining natural numbers arranged in
increasing order ($j_1 < \cdots < j_n < \cdots$, $k_1 < \cdots < k_n < \cdots$). Then the two series

\[ u_{j_1} + \cdots + u_{j_n} + \cdots \]

and

\[ u_{k_1} + \cdots + u_{k_n} + \cdots, \]

made up of terms of the series (2), are called complementary parts of (2). Prove
that if one of the complementary parts of a convergent series (2) converges, then
so does the other, and any rearrangement of the terms of (2) which leaves all the
$j_n$ in order and all the $k_n$ in order does not change the sum of (2).


20. (Continuation). Prove that if the series (2) is conditionally convergent and if
$q$ is a vector of absolute divergence of (2), then there is a part $u_{j_1} + \cdots + u_{j_n} + \cdots$
of the series (2) such that the components of the terms $u_{j_1}, \ldots, u_{j_n}, \ldots$ along $q$ form
an absolutely divergent series, while the components of $u_{j_1}, \ldots, u_{j_n}, \ldots$ orthogonal
to $q$ form an absolutely convergent series.†

21. (Continuation). Prove that if every nonzero vector $e \in R_m$ is a vector of
absolute divergence of a conditionally convergent series (2), then every vector $f \in R_m$
is the sum of a series obtained by suitably rearranging (2).

22. (Continuation, Steinitz's theorem). Given any conditionally convergent
series (2), prove that the set of sums of all possible rearrangements of (2) is a
subspace of $R_m$,‡ orthogonal to the subspace $A$ of all vectors of absolute
convergence of (2) and “shifted by” the (uniquely determined) sum of the
projections of the terms of (2) onto $A$.


† The terms “series” and “infinite series” are synonymous, but we will consistently omit the word
“infinite.”

† Also known as the ratio test.
‡ Also known as the root test.
† Equivalently, either both series (12) and (13) converge or both diverge.
† A number $a$ is said to be the harmonic mean of two numbers $b$ and $c$ if

\[ \frac{1}{a} = \frac{1}{2} \left( \frac{1}{b} + \frac{1}{c} \right). \]

† Here we allow our series to have the sum $-\infty$ or $+\infty$. Such series are, of course, divergent series of a
special kind (cf. Sec. 4.61).

† Here, as usual, $|\cdots|$ denotes the length of the vector written between the vertical bars.

† One can also consider real power series of the form

\[ a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + \cdots + a_n (x - x_0)^n + \cdots \]

($x, x_0, a_0, a_1, \ldots, a_n$ real), but the complex case is of particular interest.

‡ We set $\rho = 0$ if $\overline{\lim} \sqrt[n]{|a_n|} = \infty$ and $\rho = \infty$ if $\overline{\lim} \sqrt[n]{|a_n|} = 0$. In the latter case, the theorem asserts that (1)
converges for all complex $z$.

† Given two vectors $p, q \in R_m$, the vector

\[ p_{\parallel} = \frac{(p, q)}{(q, q)} q \]

is called the component of $p$ along $q$ (or the projection of $p$ onto $q$), while the vector $p_{\perp} = p - p_{\parallel}$ is called
the component of $p$ orthogonal to $q$. Note that $(p_{\perp}, q) = 0$, $(p_{\parallel}, p_{\perp}) = 0$.

‡ A subset $S \subset R_m$ is called a (linear) subspace of $R_m$ if $x + y \in S$, $\alpha x \in S$ for all $x, y \in S$ and all real $\alpha$.
Two subspaces $S, T \subset R_m$ are said to be orthogonal if $(x, y) = 0$ for all $x \in S$, $y \in T$. Given any vector $p \in R_m$
and any subspace $S \subset R_m$, $p$ has a unique representation of the form $p = p_S + p_{\perp}$, where $p_S \in S$ and $p_{\perp}$ is
orthogonal to $S$ in the sense that $(p_{\perp}, x) = 0$ for all $x \in S$ (prove this). We then call $p_S$ the projection of $p$
onto $S$.


7 The Derivative 


7.1. Definitions and Examples 


7.11. Given any (finite) real function $f(x)$ defined on an interval $(a, b)$ and any point
$x_0$ in $(a, b)$, consider the “difference quotient”

\[ \frac{f(x_0 + h) - f(x_0)}{h}, \tag{1} \]

where the point $x_0 + h$ also lies in $(a, b)$. Suppose (1) approaches a (finite) limit
as $h \to 0$. Then the function $f(x)$ is said to be differentiable at $x = x_0$, and we
write

\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0) \equiv [f(x)]'_{x = x_0}. \tag{2} \]

The number $f'(x_0)$ is called the derivative of the function $f(x)$ at the point $x = x_0$,
and the operation leading from $f(x)$ to $f'(x_0)$ is called differentiation.

Geometrically, the quantity (1) is just the slope of the chord intersecting the
graph of the function $y = f(x)$ in the points with abscissas $x_0$ and $x_0 + h$ (see
Figure 10). The expression (2) is the slope of a certain straight line going
through the point $(x_0, y_0)$, where $y_0 = f(x_0)$. This line (shown in the figure) is called the tangent to
the graph of the function $y = f(x)$ at the point with abscissa $x_0$.
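To see the definition at work, one can tabulate the difference quotient for shrinking values of $h$. The following is a minimal sketch under our own choice of example, $f(x) = x^2$ at $x_0 = 3$; the names `difference_quotient` and `square` are illustrative and do not come from the text.

```python
def difference_quotient(f, x0: float, h: float) -> float:
    """Slope of the chord through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h

def square(x: float) -> float:
    return x * x

# For f(x) = x^2 the quotient equals 2*x0 + h, so it approaches 2*x0 = 6
# as h -> 0 from either side.
x0 = 3.0
for h in (1.0, 0.1, 0.01, 0.001, -0.001):
    print(h, difference_quotient(square, x0, h))
```

The printed slopes approach 6 whether $h$ is positive or negative, in agreement with the requirement that (1) have a single limit as $h \to 0$.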


7.12. THEOREM. If the function $f(x)$ has a derivative at $x = x_0$, then $f(x)$ is
continuous at $x = x_0$.

Proof. As $h \to 0$, the quotient (1) is bounded by some constant $C$. Hence

\[ |f(x_0 + h) - f(x_0)| \le C|h| \]

for all sufficiently small $|h|$, which implies the continuity of $f(x)$ at $x = x_0$. ∎


7.13. Basic rules for calculating derivatives 


THEOREM. Suppose the functions $f(x)$ and $g(x)$ are differentiable at the point $x = x_0$,
and let $k$ be an arbitrary real number. Then the functions $kf(x)$, $f(x) + g(x)$,
$f(x)g(x)$, and $f(x)/g(x)$ are also differentiable at $x = x_0$ (provided $g(x_0) \ne 0$ in the last
case),


Figure 10 


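with derivatives

\[ [kf(x)]'_{x = x_0} = k f'(x_0), \tag{3} \]

\[ [f(x) + g(x)]'_{x = x_0} = f'(x_0) + g'(x_0), \tag{4} \]

\[ [f(x) g(x)]'_{x = x_0} = f'(x_0) g(x_0) + f(x_0) g'(x_0), \tag{5} \]

\[ \left[ \frac{f(x)}{g(x)} \right]'_{x = x_0} = \frac{f'(x_0) g(x_0) - f(x_0) g'(x_0)}{g^2(x_0)}. \tag{6} \]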


Proof. To prove (3) and (4), pass to the limit as $h \to 0$ in the identities

\[ \frac{kf(x_0 + h) - kf(x_0)}{h} = k \frac{f(x_0 + h) - f(x_0)}{h}, \]

\[ \frac{[f(x_0 + h) + g(x_0 + h)] - [f(x_0) + g(x_0)]}{h} = \frac{f(x_0 + h) - f(x_0)}{h} + \frac{g(x_0 + h) - g(x_0)}{h}, \]

using the familiar rules for calculating limits of sums and products (Theorems
4.36a and 4.36b). To prove (5), pass to the limit as $h \to 0$ in the identity

\[ \frac{f(x_0 + h) g(x_0 + h) - f(x_0) g(x_0)}{h} = \frac{f(x_0 + h) g(x_0 + h) - f(x_0) g(x_0 + h)}{h} + \frac{f(x_0) g(x_0 + h) - f(x_0) g(x_0)}{h} \]

\[ = g(x_0 + h) \frac{f(x_0 + h) - f(x_0)}{h} + f(x_0) \frac{g(x_0 + h) - g(x_0)}{h}, \]

using the same rules and Theorem 7.12. Once having proved (5), we need only
prove (6) for the case $f(x) \equiv 1$, passing to the limit as $h \to 0$ in the identity

\[ \frac{1}{h} \left[ \frac{1}{g(x_0 + h)} - \frac{1}{g(x_0)} \right] = -\frac{1}{g(x_0) g(x_0 + h)} \cdot \frac{g(x_0 + h) - g(x_0)}{h}, \]

with the help of Theorem 7.12 and the rule for calculating the limit of a
reciprocal (Theorem 4.36d). ∎
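Formulas (5) and (6) are easy to check numerically. The sketch below is an illustration under assumptions of our own: the functions $f(x) = x^2 + 1$ and $g(x) = x^3 - 2$, the point $x_0 = 2$, and the helper `central_difference` (a symmetric difference quotient, used only as a numerical stand-in for the limit) are not taken from the text.

```python
def central_difference(func, x0: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient, a numerical approximation to func'(x0)."""
    return (func(x0 + h) - func(x0 - h)) / (2 * h)

def f(x: float) -> float:
    return x * x + 1.0        # f'(x) = 2x

def g(x: float) -> float:
    return x ** 3 - 2.0       # g'(x) = 3x^2

x0 = 2.0
f0, g0 = f(x0), g(x0)               # 5 and 6
df0, dg0 = 2 * x0, 3 * x0 ** 2      # 4 and 12

product_rule = df0 * g0 + f0 * dg0                  # formula (5)
quotient_rule = (df0 * g0 - f0 * dg0) / g0 ** 2     # formula (6)

product_numeric = central_difference(lambda x: f(x) * g(x), x0)
quotient_numeric = central_difference(lambda x: f(x) / g(x), x0)

print(product_rule, product_numeric)
print(quotient_rule, quotient_numeric)
```

The two columns agree to within the accuracy of the finite-difference approximation.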


7.14. Examples 


a. The constant function f(x) = c is obviously differentiable everywhere, with 
derivative 0. 


b. The function f(x) = x is differentiable everywhere, with derivative 1. 
c. Given any polynomial

\[ P(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n \]

and any rational function

\[ R(x) = \frac{P(x)}{Q(x)} = \frac{a_0 x^n + a_1 x^{n-1} + \cdots + a_n}{b_0 x^m + b_1 x^{m-1} + \cdots + b_m}, \]

it follows from Theorem 7.13 that $P(x)$ is differentiable at every point $x = x_0$,
while $R(x)$ is differentiable at every point $x = x_0$ such that $Q(x_0) \ne 0$. The
derivatives of $P(x)$ and $R(x)$ are easily calculated by using (3)-(6), together with
the formula

\[ (x^n)' = n x^{n-1}. \tag{7} \]

To prove (7), we use induction (see Sec. 1.43), noting that (7) obviously holds
for $n = 1$, while if (7) holds for an integer $n$, then, by (5),

\[ (x^{n+1})' = (x^n \cdot x)' = (x^n)' x + x^n (x)' = n x^{n-1} \cdot x + x^n = (n + 1) x^n. \]


d. Similarly, it follows from (5) and mathematical induction that the derivative
of a product of $n$ functions is given by

\[ [f_1(x) f_2(x) \cdots f_n(x)]' = f_1'(x) f_2(x) \cdots f_n(x) + f_1(x) f_2'(x) \cdots f_n(x) + \cdots + f_1(x) f_2(x) \cdots f_n'(x). \tag{8} \]


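e. Consider the determinant

\[ W(x) = \begin{vmatrix} u_{11}(x) & u_{12}(x) & \cdots & u_{1n}(x) \\ u_{21}(x) & u_{22}(x) & \cdots & u_{2n}(x) \\ \vdots & \vdots & & \vdots \\ u_{n1}(x) & u_{n2}(x) & \cdots & u_{nn}(x) \end{vmatrix}, \]

made up of $n^2$ differentiable functions $u_{ij}(x)$. By definition, $W(x)$ is the sum of $n!$
terms with appropriate signs, each term a product of $n$ factors, one from each
row and each column of the determinant.† Using the rule (8) to differentiate each
term and collecting first the terms differentiated with respect to the factors in the
first column, then the terms differentiated with respect to the factors in the
second column, and so on, we get the formula

\[ W'(x) = \begin{vmatrix} u_{11}'(x) & u_{12}(x) & \cdots & u_{1n}(x) \\ u_{21}'(x) & u_{22}(x) & \cdots & u_{2n}(x) \\ \vdots & \vdots & & \vdots \\ u_{n1}'(x) & u_{n2}(x) & \cdots & u_{nn}(x) \end{vmatrix} + \begin{vmatrix} u_{11}(x) & u_{12}'(x) & \cdots & u_{1n}(x) \\ u_{21}(x) & u_{22}'(x) & \cdots & u_{2n}(x) \\ \vdots & \vdots & & \vdots \\ u_{n1}(x) & u_{n2}'(x) & \cdots & u_{nn}(x) \end{vmatrix} + \cdots + \begin{vmatrix} u_{11}(x) & u_{12}(x) & \cdots & u_{1n}'(x) \\ u_{21}(x) & u_{22}(x) & \cdots & u_{2n}'(x) \\ \vdots & \vdots & & \vdots \\ u_{n1}(x) & u_{n2}(x) & \cdots & u_{nn}'(x) \end{vmatrix}. \]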


7.15. a. THEOREM (Differentiation of a composite function). Suppose $y = f(x)$ is
differentiable at $x = x_0$, while $z = g(y)$ is defined on an interval containing the
point $y_0 = f(x_0)$ and is differentiable at $y = y_0$. Then the composite function

\[ z = h(x) \equiv g(f(x)) \]

is differentiable at $x = x_0$, with derivative

\[ h'(x_0) = g'(y_0) f'(x_0). \tag{9} \]

Proof. By the definition of the derivative,

\[ y - y_0 = f(x) - f(x_0) = [f'(x_0) + \varepsilon(x)](x - x_0), \]
\[ g(y) - g(y_0) = [g'(y_0) + \delta(y)](y - y_0), \]

where $\varepsilon(x) \to 0$ as $x \to x_0$, $\delta(y) \to 0$ as $y \to y_0$, and hence

\[ h(x) - h(x_0) = g(f(x)) - g(f(x_0)) = g(y) - g(y_0) = [g'(y_0) + \delta(y)](y - y_0) = [g'(y_0) + \delta(y)][f'(x_0) + \varepsilon(x)](x - x_0). \]

As $x \to x_0$, $\varepsilon(x) \to 0$ and moreover $y \to y_0$, by the continuity of $y = f(x)$ at the
point $x = x_0$ (Theorem 7.12), so that $\delta(y) \to 0$. To get (9), we now take the limit
as $x \to x_0$ in the identity

\[ \frac{h(x) - h(x_0)}{x - x_0} = [g'(y_0) + \delta(y)][f'(x_0) + \varepsilon(x)]. \; ∎ \]


b. Example. To find the derivative of the function

\[ z = (1 + x^2)^{99}, \]

we need only write $z$ as a composite function

\[ z = y^{99}, \qquad y = 1 + x^2. \]

We then have

\[ \frac{dz}{dy} = 99 y^{98}, \qquad y' = 2x, \]

and hence

\[ z' = 99 y^{98} \cdot 2x = 198 x (1 + x^2)^{98}, \]

by the preceding theorem. It would be the height of folly to expand $(1 + x^2)^{99}$ by
the binomial theorem and then use Theorem 7.13 directly.
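What is folly by hand is harmless for a machine, and the comparison gives an exact check of the chain-rule result. The sketch below, with function names of our own choosing, expands $(1 + x^2)^{99}$ by the binomial theorem, differentiates term by term with formula (7), and compares the outcome with $198x(1 + x^2)^{98}$ in exact integer arithmetic.

```python
from math import comb

def derivative_by_chain_rule(x: int) -> int:
    """Value of 198 * x * (1 + x^2)^98, the result of Example 7.15b."""
    return 198 * x * (1 + x * x) ** 98

def derivative_by_expansion(x: int) -> int:
    """Differentiate (1 + x^2)^99 = sum_k C(99, k) x^(2k) term by term."""
    return sum(comb(99, k) * 2 * k * x ** (2 * k - 1) for k in range(1, 100))

# Python integers are exact, so the two values must coincide digit for digit.
for x in (-3, 0, 1, 2, 7):
    assert derivative_by_chain_rule(x) == derivative_by_expansion(x)

print(derivative_by_chain_rule(1))  # 198 * 2**98
```

The one-line chain-rule formula replaces a sum of 99 terms, which is the point of the example.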


7.16. THEOREM (Differentiation of an inverse function). Let $y = f(x)$ be
continuous and increasing on an interval $(a, b)$, with inverse $x = \varphi(y)$, and
suppose $y = f(x)$ is differentiable at $x = x_0 \in (a, b)$, with derivative $f'(x_0) \ne 0$. Then
the function $x = \varphi(y)$ is differentiable at $y = y_0 = f(x_0)$, with derivative

\[ \varphi'(y_0) = \frac{1}{f'(x_0)}. \]

Proof. Since

\[ \varphi(y) = x, \qquad \varphi(y_0) = x_0, \qquad y = f(x), \qquad y_0 = f(x_0), \]

where $y$ lies in some neighborhood of $y_0$, we have

\[ \frac{\varphi(y) - \varphi(y_0)}{y - y_0} = \frac{x - x_0}{y - y_0} = \frac{1}{\dfrac{f(x) - f(x_0)}{x - x_0}}. \tag{10} \]

But $\varphi(y)$ is continuous at the point $y = y_0$ (see Theorem 5.35b), and hence $y \to y_0$
implies $x \to x_0$. Therefore, taking the limit of (10) as $y \to y_0$, we get

\[ \varphi'(y_0) = \lim_{y \to y_0} \frac{\varphi(y) - \varphi(y_0)}{y - y_0} = \lim_{x \to x_0} \frac{1}{\dfrac{f(x) - f(x_0)}{x - x_0}} = \frac{1}{f'(x_0)}. \; ∎ \]
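The theorem can be tested on a function whose inverse has no convenient closed form. The following sketch rests on assumptions of our own: the increasing function $f(x) = x^3 + x$, the point $x_0 = 1$ (so that $y_0 = 2$ and $f'(x_0) = 4$), and a bisection routine `inverse_by_bisection` standing in for $\varphi$.

```python
def f(x: float) -> float:
    return x ** 3 + x            # continuous and increasing on the whole line

def f_prime(x: float) -> float:
    return 3 * x * x + 1

def inverse_by_bisection(y: float, lo: float = -10.0, hi: float = 10.0) -> float:
    """Solve f(x) = y for x by bisection; valid because f is increasing."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = 1.0
y0 = f(x0)                       # 2.0
h = 1e-5
# Symmetric difference quotient of the inverse function at y0.
numeric = (inverse_by_bisection(y0 + h) - inverse_by_bisection(y0 - h)) / (2 * h)
predicted = 1 / f_prime(x0)      # 1 / f'(x0) = 0.25

print(numeric, predicted)
```

The difference quotient of the inverse agrees with $1/f'(x_0)$ to many decimal places, although $\varphi$ itself was never written down.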


7.17. Differentiation of the logarithm and related functions 


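a. Let $y = \log_a x$. Then, for $x_0 > 0$,

\[ \frac{\log_a (x_0 + h) - \log_a x_0}{h} = \frac{1}{h} \log_a \frac{x_0 + h}{x_0} = \frac{1}{h} \log_a \left( 1 + \frac{h}{x_0} \right) = \frac{1}{x_0} \left[ \frac{\log_a \left( 1 + \dfrac{h}{x_0} \right)}{h / x_0} \right], \]

where the expression in brackets approaches $\log_a e$ as $h \to 0$ (see Theorem
5.58c). It follows that

\[ (\log_a x)'_{x = x_0} = \frac{\log_a e}{x_0}. \]

This formula takes a particularly simple form if $a = e$, in which case we are
dealing with natural logarithms (Sec. 5.58c):

\[ (\ln x)' = \frac{1}{x}. \tag{11} \]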


b. To differentiate the function $y = \ln(-x)$, defined for $x < 0$, we use Theorem
7.15a, obtaining

\[ [\ln(-x)]' = \frac{1}{-x} (-x)' = \frac{1}{-x} (-1) = \frac{1}{x}. \tag{11'} \]

Formulas (11) and (11') can be combined into the single formula

\[ (\ln |x|)' = \frac{1}{x}, \tag{12} \]

valid for all $x \ne 0$.


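c. The exponential $x = a^y$ is the inverse of the logarithm $y = \log_a x$, and hence, by
Theorem 7.16,

\[ (a^y)'_{y = y_0} = \frac{1}{(\log_a x)'_{x = x_0}} = \frac{x_0}{\log_a e} = a^{y_0} \ln a. \]

Replacing $y_0$ by $x$ ($-\infty < x < \infty$), we get the formula

\[ (a^x)' = a^x \ln a, \]

which becomes particularly simple for $a = e$:

\[ (e^x)' = e^x. \tag{13} \]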


d. The function

\[ y = x^{\alpha}, \tag{14} \]

where $\alpha$ is an arbitrary real number, is defined for all $x > 0$ (Sec. 5.53). Writing
(14) in the form

\[ y = e^{\alpha \ln x} \]

and using Theorem 7.15a, we get

\[ y' = e^{\alpha \ln x} \cdot \frac{\alpha}{x} = \alpha x^{\alpha - 1}, \tag{15} \]

with the help of (11) and (13).† For $\alpha = 1, 2, \ldots$ this agrees with formula (7), as is
to be expected.


7.18. Differentiation of trigonometric functions and their inverses 


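a. For the function $y = \sin x$, we have

\[ \sin(x + h) - \sin x = 2 \sin \frac{h}{2} \cos \left( x + \frac{h}{2} \right) \]

(Sec. 5.63), and hence

\[ \frac{\sin(x + h) - \sin x}{h} = \frac{\sin \dfrac{h}{2}}{\dfrac{h}{2}} \cos \left( x + \frac{h}{2} \right). \]

But

\[ \lim_{x \to 0} \frac{\sin x}{x} = 1 \]

(Theorem 5.64b), and hence

\[ (\sin x)' = \lim_{h \to 0} \frac{\sin \dfrac{h}{2}}{\dfrac{h}{2}} \cos \left( x + \frac{h}{2} \right) = \lim_{h \to 0} \cos \left( x + \frac{h}{2} \right), \]

which implies

\[ (\sin x)' = \cos x, \tag{16} \]

by the continuity of the function $\cos x$ (see Theorem 5.64a).

b. It follows from (16) and the formulas

\[ \sin \left( \frac{\pi}{2} + x \right) = \cos x, \qquad \cos \left( \frac{\pi}{2} + x \right) = -\sin x \]

(Sec. 5.65b) that

\[ (\cos x)' = \left[ \sin \left( \frac{\pi}{2} + x \right) \right]' = \cos \left( \frac{\pi}{2} + x \right), \]

and hence

\[ (\cos x)' = -\sin x. \tag{17} \]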


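c. Formulas (6), (16), and (17) in turn imply

\[ (\tan x)' = \left( \frac{\sin x}{\cos x} \right)' = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x}, \]

or simply

\[ (\tan x)' = \sec^2 x, \tag{18} \]

provided of course that $\cos x$ does not vanish, i.e., that

\[ x \ne \frac{2k + 1}{2} \pi \qquad (k = 0, \pm 1, \pm 2, \ldots). \]

d. If

\[ x = \arcsin u, \quad u = \sin x \qquad (-\pi/2 \le x \le \pi/2, \; -1 \le u \le 1), \]

then, by Theorem 7.16,†

\[ (\arcsin u)' = \frac{1}{(\sin x)'} = \frac{1}{\cos x} = \frac{1}{\sqrt{1 - \sin^2 x}}, \]

i.e.,

\[ (\arcsin u)' = \frac{1}{\sqrt{1 - u^2}}. \tag{19} \]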
e. Similarly, if

\[ x = \arccos u, \quad u = \cos x \qquad (-\pi \le x \le 0, \; -1 \le u \le 1), \]

then

\[ (\arccos u)' = \frac{1}{(\cos x)'} = -\frac{1}{\sin x} = \frac{1}{\sqrt{1 - \cos^2 x}}, \]

i.e.,

\[ (\arccos u)' = \frac{1}{\sqrt{1 - u^2}}. \tag{20} \]

We can also get (20) by differentiating the formula

\[ \arccos u = \arcsin u - \frac{\pi}{2} \tag{21} \]

(Sec. 5.67) with the help of (19).
More exactly, formula (20) pertains to the “increasing branch” of $\arccos u$.
For the “decreasing branch” we have

\[ \arccos u = \frac{\pi}{2} - \arcsin u \tag{21'} \]

instead of (21) and

\[ (\arccos u)' = -\frac{1}{\sqrt{1 - u^2}} \tag{20'} \]

instead of (20).†


f. Finally, if

\[ x = \arctan u, \quad u = \tan x \qquad (-\pi/2 < x < \pi/2, \; -\infty < u < \infty), \]

then

\[ (\arctan u)' = \frac{1}{(\tan x)'} = \cos^2 x = \frac{1}{1 + \tan^2 x}, \]

i.e.,

\[ (\arctan u)' = \frac{1}{1 + u^2}. \tag{22} \]


As an exercise, the reader should find the derivatives of the remaining 
trigonometric functions (cot x, sec x, csc x) and their inverses. 


7.19. One-sided derivatives 


a. By the left-hand derivative of a function $f(x)$, defined on an interval $a < x \le x_0$,
we mean the quantity

\[ f_l'(x_0) = \lim_{h \nearrow 0} \frac{f(x_0 + h) - f(x_0)}{h} \]

($l$ for “left”), provided it exists. Instead of $f_l'(x_0)$ we sometimes write $f'(x_0 - 0)$.

b. By the right-hand derivative of a function $f(x)$, defined on an interval $x_0 \le x < b$,
we mean the quantity

\[ f_r'(x_0) = \lim_{h \searrow 0} \frac{f(x_0 + h) - f(x_0)}{h} \]

($r$ for “right”), provided it exists. Instead of $f_r'(x_0)$ we sometimes write $f'(x_0 + 0)$.


c. If a function $f(x)$ is defined on an interval $(a, b)$ containing the point $x_0$, we can
inquire as to the existence of the “one-sided derivatives” $f_l'(x_0)$ and $f_r'(x_0)$.
Obviously, if the ordinary derivative $f'(x_0)$ exists, then both $f_l'(x_0)$ and $f_r'(x_0)$ exist and
satisfy the equality

\[ f'(x_0) = f_l'(x_0) = f_r'(x_0). \]

But it may well happen that $f_l'(x_0)$ and $f_r'(x_0)$ both exist, without $f'(x_0)$ existing
(see Figure 11). On the other hand, if $f_l'(x_0)$ and $f_r'(x_0)$ both exist and are equal,
then $f'(x_0)$ exists and equals the common value of $f_l'(x_0)$ and $f_r'(x_0)$. This follows
at once from the corresponding theorem on “partial limits” (Theorem 4.16c).
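The standard example of one-sided derivatives that exist but differ is $f(x) = |x|$ at $x_0 = 0$. The sketch below is our own illustration (the helper `one_sided_quotients` is a hypothetical name); it evaluates the difference quotient separately for negative and positive increments.

```python
def one_sided_quotients(f, x0: float, steps=(1e-1, 1e-3, 1e-6)):
    """Difference quotients of f at x0 for increments -h (left) and +h (right)."""
    left = [(f(x0 - h) - f(x0)) / (-h) for h in steps]
    right = [(f(x0 + h) - f(x0)) / h for h in steps]
    return left, right

left, right = one_sided_quotients(abs, 0.0)

# Every left quotient equals -1 and every right quotient equals +1, so both
# one-sided derivatives exist at 0 but the ordinary derivative does not.
print(left)
print(right)
```

Since the two one-sided limits are $-1$ and $+1$, the difference quotient has no single limit as $h \to 0$, in accordance with the remark on “partial limits.”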
d. The ray defined by the equation

\[ y = f(x_0) + f_l'(x_0)(x - x_0) \qquad (x \le x_0) \tag{23} \]

is called the left-hand tangent to the curve $y = f(x)$ at the point $x = x_0$. Similarly,
the ray defined by the equation

\[ y = f(x_0) + f_r'(x_0)(x - x_0) \qquad (x \ge x_0) \tag{23'} \]

is called the right-hand tangent† to the curve $y = f(x)$ at the point $x = x_0$.


e. Theorem 7.16 is easily generalized to the case of one-sided derivatives: 


Figure 11 


THEOREM. Let $y = f(x)$ be continuous and increasing on an interval $(a, b)$, with
inverse $x = \varphi(y)$, and suppose $y = f(x)$ has a left-hand derivative $f'(x_0 - 0) \ne 0$ at
$x = x_0 \in (a, b)$. Then the function $x = \varphi(y)$ has a left-hand derivative

\[ \varphi'(y_0 - 0) = \frac{1}{f'(x_0 - 0)} \]

at $y = y_0 = f(x_0)$.

Proof. The same as that of Theorem 7.16, except for the restriction that $x \le x_0$,
$y \le y_0$. ∎

The theorem obviously remains true if we change “left-hand” to “right-hand”
and $-0$ to $+0$.


7.2. Properties of Differentiable Functions 


7.21. We now turn to a study of general properties of differentiable functions. 


THEOREM. Given a function $y = f(x)$, consider all linear functions with the same
value as $f(x)$ at the point $x = x_0$, i.e., all functions of the form

\[ y_A = A(x - x_0) + f(x_0) = Ah + f(x_0) \qquad (h = x - x_0). \tag{1} \]

Then $y = f(x)$ is differentiable at $x = x_0$, with derivative $f'(x_0)$, if and only if there
exists a linear function (1) such that

\[ y - y_A = \varepsilon(h) h \qquad (h = x - x_0), \tag{2} \]

where $\varepsilon(h) \to 0$ as $h \to 0$,† in which case $A = f'(x_0)$.

Proof. If (2) holds, then

\[ \frac{y - y_A}{h} = \frac{f(x) - Ah - f(x_0)}{h} = \frac{f(x_0 + h) - f(x_0)}{h} - A = \varepsilon(h) \to 0 \]

as $h \to 0$, and hence

\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]

exists and equals $A$. Conversely, if $f(x)$ is differentiable at $x_0$, with derivative
$f'(x_0)$, then

\[ \frac{f(x_0 + h) - f(x_0)}{h} - f'(x_0) \equiv \varepsilon(h) \to 0 \]

as $h \to 0$, and hence

\[ f(x_0 + h) - [f'(x_0) h + f(x_0)] = \varepsilon(h) h. \tag{3} \]

Noting that $f(x_0 + h) = f(x) = y$ and choosing

\[ y_A = Ah + f(x_0) = f'(x_0) h + f(x_0), \]

we get (2). ∎


7.22. By definition (Sec. 7.11), the straight line in the xy-plane with equation

\[ y = f'(x_0)(x - x_0) + f(x_0) \tag{4} \]

is the tangent to the curve $y = f(x)$ at $x = x_0$ (more exactly, at the point with
abscissa $x_0$). According to the above theorem, given any $\varepsilon > 0$, there is a $\delta > 0$
such that $|h| < \delta$ implies $|\varepsilon(h)| < \varepsilon$, where $\varepsilon(h) h$ is the deviation of the linear
function (4) from the curve $y = f(x)$. Thus

\[ -\varepsilon |h| \le f(x) - f'(x_0) h - f(x_0) \le \varepsilon |h| \]

if $|h| < \delta$, i.e.,

\[ [f'(x_0) - \varepsilon] h + f(x_0) \le f(x) \le [f'(x_0) + \varepsilon] h + f(x_0) \tag{5} \]

if $0 \le h < \delta$, with the inequalities reversed if $-\delta < h \le 0$. This has the following geometric interpretation: If $f(x)$ is differentiable
at $x = x_0$, then near $x = x_0$ the curve $y = f(x)$ lies between two straight lines making
arbitrarily small angles with the tangent to the curve at $x = x_0$ (see Figure 12). It
follows at once that if $f'(x_0) < 0$, there is a $\delta > 0$ such that

\[ f(x_0 - h) > f(x_0) > f(x_0 + h) \tag{6} \]

whenever $0 < h < \delta$, while if $f'(x_0) > 0$, there is a $\delta > 0$ such that


Figure 12 


Figure 13 


Figure 14 
\[ f(x_0 - h) < f(x_0) < f(x_0 + h) \tag{7} \]

whenever $0 < h < \delta$.


7.23. A function $f(x)$ is said to have a local maximum at a point $c \in (a, b)$ if there
exists an $h > 0$ such that $f(x) \le f(c)$ for all $x \in (c - h, c + h)$. Similarly, a function
$f(x)$ is said to have a local minimum at a point $c \in (a, b)$ if there exists an $h > 0$
such that $f(x) \ge f(c)$ for all $x \in (c - h, c + h)$. If $f(x)$ has a local maximum or a
local minimum at a point $c$, we say that $f(x)$ has a local extremum at $c$.

Figure 13 illustrates the geometric meaning of a local maximum, while Figure
14 illustrates that of a local minimum.

THEOREM. If $f(x)$ is differentiable at a point $x = x_0$ where it has a local extremum,
then $f'(x_0) = 0$.

Proof. If $f'(x_0) \ne 0$, then $f(x)$ cannot have a local extremum at $x_0$, because of the
inequalities (6) and (7). ∎

Thus to find the points at which an everywhere differentiable function $f(x)$ has
local extrema, we must analyze the equation

\[ f'(x) = 0, \tag{8} \]

since the points in question must be among the solutions of (8). Note, however,
that $f'(x)$ may well vanish at a point $c$ where there is no local extremum, as for
example in the case of the function $f(x) = x^3$ at the point $c = 0$. A more detailed
analysis of extrema will be given in Secs. 7.5 and 8.4.
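The distinction between solutions of (8) and genuine extrema can be illustrated by sampling a function on both sides of a critical point. The sketch below is a heuristic illustration of our own, not a proof: `classify_critical_point` is a hypothetical helper that compares only three function values, and the second example $f(x) = x^3 - 3x$ (with $f'(x) = 3x^2 - 3$ vanishing at $x = \pm 1$) is an assumption introduced for contrast.

```python
def classify_critical_point(f, c: float, h: float = 1e-3) -> str:
    """Compare f at c with its values at c - h and c + h (a sampling heuristic)."""
    left, mid, right = f(c - h), f(c), f(c + h)
    if left < mid and right < mid:
        return "local maximum"
    if left > mid and right > mid:
        return "local minimum"
    return "no extremum detected"

def cube(x: float) -> float:
    return x ** 3                 # derivative 3x^2 vanishes at 0

def cubic(x: float) -> float:
    return x ** 3 - 3 * x         # derivative 3x^2 - 3 vanishes at -1 and 1

print(classify_critical_point(cube, 0.0))
print(classify_critical_point(cubic, -1.0))
print(classify_critical_point(cubic, 1.0))
```

For $x^3$ the values to the left of 0 are smaller and those to the right are larger, so the vanishing of the derivative does not produce an extremum there.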


7.3. The Differential 


7.31. It follows from formula (3) of Sec. 7.21 that the increment

\[ \Delta y = f(x_0 + h) - f(x_0) \]

of the function $y = f(x)$, when the independent variable $x$ changes from $x_0$ to $x_0 + h$,
is the sum of two parts, a part $f'(x_0) h$ which is linear in the increment $h$ of the
independent variable and a part $\varepsilon(h) h$ which is “of a higher order of smallness
than $h$.” Geometrically, the first part is the increment of $f(x)$ as measured along
the tangent, while the second part is the difference between the “true value” $f(x_0 + h)$
and the value as measured along the tangent (see Figure 15). The term
$f'(x_0) h$ is called the principal linear part of the increment $\Delta y$. Thus the existence
of the derivative implies the possibility of separating out a principal linear part
from the increment of the function, and conversely (this is the content of
Theorem 7.21).

Figure 15
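The splitting of the increment into a principal linear part and a remainder of higher order can be followed exactly on a simple example. The sketch below assumes, purely for illustration, $f(x) = x^3$ and $x_0 = 2$, for which $\Delta y = 12h + 6h^2 + h^3$ and hence $\varepsilon(h) = 6h + h^2$; the name `split_increment` is hypothetical, and exact rational arithmetic is used so that no rounding obscures the result.

```python
from fractions import Fraction

def split_increment(h: Fraction):
    """Return (increment, principal linear part, epsilon(h)) for f(x) = x^3 at x0 = 2."""
    x0 = Fraction(2)
    increment = (x0 + h) ** 3 - x0 ** 3      # Delta y
    linear_part = 3 * x0 ** 2 * h            # f'(x0) * h with f'(x0) = 12
    epsilon = (increment - linear_part) / h  # remaining factor epsilon(h)
    return increment, linear_part, epsilon

for denominator in (10, 100, 1000):
    h = Fraction(1, denominator)
    increment, linear_part, epsilon = split_increment(h)
    print(h, float(increment), float(linear_part), float(epsilon))
```

As $h$ decreases tenfold, the linear part decreases tenfold while the remainder $\varepsilon(h) h$ decreases roughly a hundredfold, which is what “of a higher order of smallness than $h$” means.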


7.32. Suppose $f(x)$ is differentiable at $x = x_0$, and let the quantity $h = x - x_0$ be
denoted by $dx$ and the quantity

\[ f'(x_0) h = f'(x_0) \, dx \]

by $dy(x_0)$, or more briefly by $dy$ or $df$. Then the quantities $dx$ and $dy$ are called
differentials, the first the differential of the independent variable $x$, the second
the differential of the dependent variable $y$ (at the point $x = x_0$). Note that the
differential $dx$ is just the increment $h$ of the independent variable $x$, while the
differential $dy$ is just the principal linear part of the increment of the dependent
variable† $y$. Thus $dy$ is a linear function of $dx$. Using $dx$ and $dy$, we can write $f'(x_0)$
as a ratio of differentials:

\[ f'(x_0) = \frac{dy}{dx}. \]


7.33. The rules for calculating derivatives summarized in Theorem 7.13 lead to
corresponding rules for calculating differentials. In fact, multiplying both sides
of formulas (3)-(6) of Sec. 7.13 by $dx$, we get

\[ d(kf) = k \, df, \tag{1} \]
\[ d(f + g) = df + dg, \tag{2} \]
\[ d(fg) = f \, dg + g \, df, \tag{3} \]
\[ d\left( \frac{f}{g} \right) = \frac{g \, df - f \, dg}{g^2(x_0)} \qquad (g(x_0) \ne 0). \tag{4} \]


7.34. The differential of a composite function. Multiplying both sides of
formula (9) of Sec. 7.15a by $dx$, we get

\[ dz = h'(x_0) \, dx = g'(y_0) f'(x_0) \, dx = g'(y_0) \, dy. \]

But if $y$ were the independent variable rather than a function of $x$, the differential
of the function $z$ would take precisely the same form

\[ dz = g'(y_0) \, dy. \]

Hence the differential of a function does not depend on whether its argument is
the independent variable or a function of some other independent variable. This
“invariance property” can be used to differentiate composite functions. Thus the
function $(x^2 + 1)^{99}$ has the differential

\[ d(x^2 + 1)^{99} = 99 (x^2 + 1)^{98} \, d(x^2 + 1) = 99 (x^2 + 1)^{98} \, 2x \, dx, \]

and hence

\[ [(x^2 + 1)^{99}]' = \frac{d(x^2 + 1)^{99}}{dx} = 198 x (x^2 + 1)^{98}, \]

just as in Example 7.15b.


7.4. Mean Value Theorems 


We now establish a class of results called “mean value theorems,” relating the 
values of a function at the end points of a closed interval to the value of its 
derivative at a suitable interior point of the interval. 


7.41. THEOREM (Rolle's theorem). Suppose a (finite) function $f(x)$ is continuous
on a (possibly infinite†) closed interval $[a, b]$ and differentiable at every point of
the open interval $(a, b)$, and suppose further that $f(a) = f(b)$. Then there exists a
point $c \in (a, b)$ such that $f'(c) = 0$.

Proof. By a simple argument involving Weierstrass' theorem (see Sec. 5.16c),
there is a point $c \in (a, b)$ such that

\[ f(c) = \sup_{a \le x \le b} f(x) \]

or

\[ f(c) = \inf_{a \le x \le b} f(x). \]

Hence $f(x)$ has a local extremum at the point $c$. But then $f'(c) = 0$, by Theorem 7.23. ∎

7.42. THEOREM. Let $f(x)$ have the same continuity and differentiability properties
as in Rolle's theorem, and suppose $f'(x) \ne 0$ for all $x \in (a, b)$. Then $f(a) \ne f(b)$.

Proof. An obvious consequence of Rolle's theorem. ∎


7.43. THEOREM (Cauchy's theorem). Suppose two (finite) functions $f(x)$ and $g(x)$
are continuous on a (possibly infinite) closed interval $[a, b]$ and differentiable at
every point of the open interval $(a, b)$, and suppose further that $g'(x)$ does not
vanish at any point of $(a, b)$. Then there exists a point $c \in (a, b)$ such that

\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}. \tag{1} \]

Proof. First we note that $g(b) \ne g(a)$, by Theorem 7.42. The function

\[ \varphi(x) = f(x) - A g(x) \]

($A$ any constant) has the same continuity and differentiability properties as the
functions $f(x)$ and $g(x)$ themselves. Let $A$ be such that $\varphi(a) = \varphi(b)$, as in Rolle's
theorem. Then $A$ satisfies the equation

\[ f(a) - A g(a) = f(b) - A g(b), \]

and hence

\[ A = \frac{f(b) - f(a)}{g(b) - g(a)}. \]

Applying Rolle's theorem to the function $\varphi(x)$, we find that there exists a point
$c \in (a, b)$ such that

\[ \varphi'(c) = 0. \tag{2} \]

But (2), i.e., $f'(c) - A g'(c) = 0$, is equivalent to (1). ∎


7.44. THEOREM (Lagrange's theorem). If $f(x)$ is continuous on a finite closed
interval $[a, b]$ and differentiable at every interior point of $[a, b]$, then there exists
a point $c \in (a, b)$ such that

\[ \frac{f(b) - f(a)}{b - a} = f'(c). \tag{3} \]

Figure 16

Proof. Choose $g(x) = x$ in Cauchy's theorem. ∎

Another version of (3) is the “finite difference formula”

\[ f(b) = f(a) + f'(c)(b - a). \tag{3'} \]

The geometric meaning of the point $c$, implied by (3) and shown in Figure 16, is
simply that the tangent to the curve $y = f(x)$ at the point with abscissa $c$ is parallel
to the chord joining the points $(a, f(a))$ and $(b, f(b))$.
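Lagrange's theorem asserts only the existence of the point $c$, but in simple cases $c$ can be located numerically. The following is a sketch under assumptions of our own: the function $f(x) = x^3$ on $[0, 3]$, whose chord has slope 9, and a bisection routine `mean_value_point` which additionally assumes that $f'$ is continuous and that $f'(x)$ minus the chord slope changes sign on the interval (neither is guaranteed by the theorem itself).

```python
import math

def mean_value_point(f_prime, slope: float, lo: float, hi: float,
                     iterations: int = 100) -> float:
    """Bisection for f'(c) = slope, assuming f' is continuous and
    f'(lo) - slope, f'(hi) - slope have opposite signs."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f_prime(lo) - slope) * (f_prime(mid) - slope) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def f(x: float) -> float:
    return x ** 3

def f_prime(x: float) -> float:
    return 3 * x * x

a, b = 0.0, 3.0
chord_slope = (f(b) - f(a)) / (b - a)          # 9.0
c = mean_value_point(f_prime, chord_slope, a, b)

print(c, math.sqrt(3.0))
```

Here $3c^2 = 9$ gives $c = \sqrt{3}$, which lies strictly inside $(0, 3)$ as the theorem requires.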


7.45. a. THEOREM. The function $f(x)$ is decreasing on $[a, b]$ if $f'(x) < 0$ for all $x \in (a, b)$
and increasing on $[a, b]$ if $f'(x) > 0$ for all $x \in (a, b)$.

Proof. Choosing $a = x'$, $b = x''$ in (3'), where $[x', x''] \subset [a, b]$, we get

\[ f(x'') = f(x') + (x'' - x') f'(c) < f(x') \]

if $f'(c) < 0$ and

\[ f(x'') = f(x') + (x'' - x') f'(c) > f(x') \]

if $f'(c) > 0$. ∎

b. THEOREM. The function $f(x)$ is constant on $[a, b]$ if $f'(x)$ vanishes for all $x \in (a, b)$.

Proof. This time

\[ f(x'') = f(x') + (x'' - x') f'(c) = f(x'), \]

since $f'(c) = 0$. ∎

c. Example. Let

\[ f(x) = \sin x, \qquad f'(x) = \cos x. \]

Then $\sin x$ is increasing on the intervals

\[ (2k - \tfrac{1}{2}) \pi \le x \le (2k + \tfrac{1}{2}) \pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

in whose interiors $\cos x$ is positive and decreasing on the intervals

\[ (2k + \tfrac{1}{2}) \pi \le x \le (2k + \tfrac{3}{2}) \pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

in whose interiors $\cos x$ is negative (see Sec. 5.65). Similarly, choosing

\[ f(x) = \cos x, \qquad f'(x) = -\sin x, \]

we see that $\cos x$ is increasing on the intervals

\[ (2k - 1) \pi \le x \le 2k\pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

in whose interiors $-\sin x$ is positive and decreasing on the intervals

\[ 2k\pi \le x \le (2k + 1) \pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

in whose interiors $-\sin x$ is negative. These results have already been found by other
means (Sec. 5.65), but the present approach is much simpler.


7.5. Concavity and Inflection Points 


7.51. Given a function $f(x)$ continuous on an interval $(a, b)$ and differentiable at a
point $x_0 \in (a, b)$, let $y = T(x)$ be the tangent to the curve $y = f(x)$ at the point
(with abscissa) $x_0$, so that

\[ y = T(x) \equiv f'(x_0)(x - x_0) + f(x_0) \]

(see Sec. 7.22). Then $f(x)$ is said to be concave downward at $x_0$ if there is a
deleted neighborhood of $x_0$ in which $f(x) < T(x)$, i.e., in which the curve $y = f(x)$
lies below its tangent at $x_0$ (see Figure 17). Similarly, $f(x)$ is said to be concave
upward at $x_0$ if there is a deleted neighborhood of $x_0$ in which
$f(x) > T(x)$, i.e., in which the curve $y = f(x)$ lies above its tangent at $x_0$ (see Figure
18). If $f(x)$ is concave downward (or upward) at every point $x \in (a, b)$, we say
that $f(x)$ is concave downward (or upward) on the interval $(a, b)$. Finally, a point
$x_0$ is said to be an inflection point of the function $f(x)$ if the curve $y = f(x)$ lies on
one side of its tangent (at $x_0$) if $x < x_0$ and on the other side of its tangent if $x > x_0$
(see Figure 19).

Figure 17

Figure 18

Figure 19


7.52. a. LEMMA. Let $C$ be the curve with equation

\[ y = f(x) \qquad (a \le x \le b), \]

and let $T$ be the tangent to $C$ at the point $x_0$. Then $C$ lies above $T$ to the left of $x_0$
($x < x_0$) if $f'(\xi) < f'(x_0)$ for all $\xi \in (a, x_0)$, while $C$ lies below $T$ to the left of $x_0$ if
$f'(\xi) > f'(x_0)$ for all $\xi \in (a, x_0)$.

Proof. Choosing $a = x_0$, $b = x$, $c = \xi$ in the finite difference formula (3') of Sec. 7.44,
we get

\[ f(x) = f(x_0) + f'(\xi)(x - x_0) > f(x_0) + f'(x_0)(x - x_0) = T(x) \tag{1} \]

($x - x_0 < 0$) if $f'(\xi) < f'(x_0)$, while $f(x) < T(x)$ ($x - x_0 < 0$) if $f'(\xi) > f'(x_0)$. ∎

b. LEMMA. Let $C$ and $T$ be the same as in the preceding lemma. Then $C$ lies above
$T$ to the right of $x_0$ ($x > x_0$) if $f'(\eta) > f'(x_0)$ for all $\eta \in (x_0, b)$, while $C$ lies below
$T$ to the right of $x_0$ if $f'(\eta) < f'(x_0)$ for all $\eta \in (x_0, b)$.

Proof. Instead of (1), we now have

\[ f(x) = f(x_0) + f'(\eta)(x - x_0) > f(x_0) + f'(x_0)(x - x_0) = T(x) \]

($x - x_0 > 0$) if $f'(\eta) > f'(x_0)$, while $f(x) < T(x)$ ($x - x_0 > 0$) if $f'(\eta) < f'(x_0)$. ∎
7.53. a. THEOREM. The function $f(x)$ is concave downward at $x_0$ if

\[ f'(\xi) > f'(x_0) > f'(\eta) \]

whenever $a < \xi < x_0 < \eta < b$ and concave upward at $x_0$ if

\[ f'(\xi) < f'(x_0) < f'(\eta) \]

whenever $a < \xi < x_0 < \eta < b$. On the other hand, $x_0$ is an inflection point of $f(x)$
if

\[ f'(\xi) > f'(x_0), \qquad f'(\eta) > f'(x_0) \]

or

\[ f'(\xi) < f'(x_0), \qquad f'(\eta) < f'(x_0) \]

whenever $a < \xi < x_0 < \eta < b$.

Proof. An immediate consequence of the above two lemmas. ∎

b. THEOREM. The function $f(x)$ is concave downward on $(a, b)$ if $f'(x)$ is decreasing
on $(a, b)$ and concave upward on $(a, b)$ if $f'(x)$ is increasing on $(a, b)$.

Proof. An immediate consequence of the preceding theorem. ∎


Further tests for concavity and inflection points, based on the use of higher 
derivatives of f(x), will be given in Sec. 8.3. 


c. Example. Let

\[ f(x) = \sin x, \qquad f'(x) = \cos x. \]

Then $\sin x$ is concave upward on the intervals

\[ (2k - 1) \pi < x < 2k\pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

on which $\cos x$ is increasing and concave downward on the intervals

\[ 2k\pi < x < (2k + 1) \pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

on which $\cos x$ is decreasing (see Example 7.45c). Similarly, choosing

\[ f(x) = \cos x, \qquad f'(x) = -\sin x, \]

we see that $\cos x$ is concave downward on the intervals

\[ (2k - \tfrac{1}{2}) \pi < x < (2k + \tfrac{1}{2}) \pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

on which $-\sin x$ is decreasing and concave upward on the intervals

\[ (2k + \tfrac{1}{2}) \pi < x < (2k + \tfrac{3}{2}) \pi \qquad (k = 0, \pm 1, \pm 2, \ldots) \]

on which $-\sin x$ is increasing.


7.54. THEOREM. Suppose $f'(x_0) = 0$. Then f(x) has a local maximum at $x_0$ if

\[ f'(\xi) > 0 > f'(\eta) \]

whenever $a < \xi < x_0 < \eta < b$ and a local minimum at $x_0$ if

\[ f'(\xi) < 0 < f'(\eta) \]

whenever $a < \xi < x_0 < \eta < b$. On the other hand, f(x) has neither a maximum nor a minimum at $x_0$ if

\[ f'(\xi) > 0, \quad f'(\eta) > 0 \]

or

\[ f'(\xi) < 0, \quad f'(\eta) < 0 \]

whenever $a < \xi < x_0 < \eta < b$.

Proof. If $f'(x_0) = 0$, then the tangent T to the curve y = f(x) at the point $x_0$ is horizontal. Under these circumstances, if f(x) is concave downward at $x_0$, then $f(x) < f(x_0)$ in a deleted neighborhood of $x_0$, so that f(x) has a local maximum at $x_0$, while if f(x) is concave upward at $x_0$, then $f(x) > f(x_0)$ in a deleted neighborhood of $x_0$, so that f(x) has a local minimum at $x_0$. On the other hand, if $x_0$ is an inflection point of f(x), then the curve y = f(x) goes from one side of the tangent T to the other as it passes through the point $x_0$, so that f(x) can have neither a maximum nor a minimum at $x_0$. The rest of the proof is now an immediate consequence of Theorem 7.53a, with $f'(x_0) = 0$. ∎


7.6. L’Hospital’s Rules 


The next two theorems are often useful in evaluating limits. They apply to limits of the form

\[ \lim_{x \searrow a} \frac{f(x)}{g(x)}, \]

where in the first case f(x) and g(x) both approach 0 as $x \searrow a$, leading to the “indeterminate form” 0/0, while in the second case f(x) and g(x) both approach $\infty$ as $x \searrow a$, leading to the “indeterminate form” $\infty/\infty$.


7.61. THEOREM (L’Hospital’s rule for 0/0). Suppose the (finite) functions f(x) and g(x) are continuous on the (possibly infinite) closed interval [a, b] and differentiable on the open interval (a, b).† Suppose further that $g'(x) \ne 0$ for all $x \in (a, b)$, while f(a) = g(a) = 0. Then

\[ \lim_{x \searrow a} \frac{f'(x)}{g'(x)} = A \quad (A \in \overline{R}) \tag{1} \]

implies

\[ \lim_{x \searrow a} \frac{f(x)}{g(x)} = A. \tag{2} \]


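Proof. Suppose first that $-\infty < A < \infty$. Then, given any $\varepsilon > 0$, choose $x_0$ such that

\[ \left| \frac{f'(x)}{g'(x)} - A \right| < \varepsilon \tag{3} \]

if $a < x < x_0$. By Cauchy’s theorem (Sec. 7.43) applied to the interval [a, x], there is a point $c \in (a, x)$ such that

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)}. \]

It follows that

\[ \left| \frac{f(x)}{g(x)} - A \right| = \left| \frac{f'(c)}{g'(c)} - A \right| < \varepsilon \]

if $a < x < x_0$, which implies (2).

In the case where A is infinite, the inequality (3) is replaced by

\[ \frac{f'(x)}{g'(x)} > \frac{1}{\varepsilon} \]

or

\[ \frac{f'(x)}{g'(x)} < -\frac{1}{\varepsilon}, \]

depending on the sign of A, and the rest of the proof is the same. ∎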


7.62. THEOREM (L’Hospital’s rule for $\infty/\infty$). Suppose the functions f(x) and g(x) are continuous and differentiable on the (possibly infinite) open interval (a, b). Suppose further that $g'(x) \ne 0$ for all $x \in (a, b)$, while $f(a) = g(a) = \infty$. Then (1) implies (2).


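Proof. Suppose first that $-\infty < A < \infty$. As before, given any $\varepsilon > 0$, choose $x_0$ such that (3) holds if $a < x < x_0$. Defining a function $D(x, x_0)$ by the condition

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)} \, D(x, x_0), \]

we see that

\[ D(x, x_0) = \frac{\dfrac{g(x) - g(x_0)}{g(x)}}{\dfrac{f(x) - f(x_0)}{f(x)}} = \frac{1 - \dfrac{g(x_0)}{g(x)}}{1 - \dfrac{f(x_0)}{f(x)}} \to 1 \tag{4} \]

as $x \searrow a$. By Cauchy’s theorem, applied this time to the interval $[x, x_0]$, there is a point $c \in (x, x_0)$ such that

\[ \frac{f(x)}{g(x)} = \frac{f'(c)}{g'(c)} \, D(x, x_0) = \frac{f'(c)}{g'(c)} + \frac{f'(c)}{g'(c)} \, [D(x, x_0) - 1]. \]

Hence for all x such that $|D(x, x_0) - 1| < \varepsilon$ we have

\[ \left| \frac{f(x)}{g(x)} - A \right| \le \left| \frac{f'(c)}{g'(c)} - A \right| + \left| \frac{f'(c)}{g'(c)} \right| \, |D(x, x_0) - 1| < \varepsilon + (|A| + \varepsilon)\varepsilon, \tag{5} \]

which implies (2), since $\varepsilon$ is arbitrarily small.

In the case where A is infinite, say $A = \infty$, the inequality (3) is replaced by

\[ \frac{f'(x)}{g'(x)} > \frac{1}{\varepsilon} \]

and the inequality (5) by

\[ \frac{f(x)}{g(x)} > \frac{1}{2} \, \frac{f'(c)}{g'(c)} > \frac{1}{2\varepsilon} \]

for all x sufficiently near a, because of (4). The case $A = -\infty$ is treated similarly. ∎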


There are obvious analogues of L’Hospital’s rules for the case where (1) and (2) are replaced by

\[ \lim_{x \nearrow b} \frac{f'(x)}{g'(x)} = A \quad (A \in \overline{R}) \tag{1'} \]

and

\[ \lim_{x \nearrow b} \frac{f(x)}{g(x)} = A. \tag{2'} \]
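
As a numerical illustration of Theorem 7.61 (not part of the original text), the following sketch tabulates $f(x)/g(x)$ and $f'(x)/g'(x)$ for the hypothetical choice $f(x) = 1 - \cos x$, $g(x) = x^2$ with a = 0. Both functions vanish at 0 and $g'(x) = 2x \ne 0$ for x > 0, so the rule applies; both quotients should approach 1/2.

```python
import math

def ratio(x):
    """f(x)/g(x) for the illustrative choice f(x) = 1 - cos x, g(x) = x**2."""
    return (1.0 - math.cos(x)) / (x * x)

def derivative_ratio(x):
    """f'(x)/g'(x) = sin x / (2x) for the same pair of functions."""
    return math.sin(x) / (2.0 * x)

# Tabulate both quotients as x decreases toward a = 0.
for x in (0.5, 0.1, 0.01):
    print(x, ratio(x), derivative_ratio(x))
```

A table of this kind only suggests the value of the limit; the proof above is what guarantees it.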


Problems 
1. Suppose f(x) is defined on [a, b] and satisfies the inequality

\[ |f(x_1) - f(x_2)| \le C |x_1 - x_2|^{1+\alpha} \quad (\alpha > 0) \]

for all $x_1, x_2 \in [a, b]$. Prove that f(x) is a constant.

2. Let f(x) be differentiable for all x > c, and suppose that

\[ \lim_{x \to \infty} f'(x) = 0. \]

Prove that

\[ \lim_{x \to \infty} [f(x + h) - f(x)] = 0 \]

for any h > 0.

3. Suppose f(x) has a continuous derivative at every point of the interval [a, b]. Prove that f(x) is uniformly differentiable on [a, b] in the sense that given any $\varepsilon > 0$, there exists a $\delta > 0$ such that

\[ \left| \frac{f(x_2) - f(x_1)}{x_2 - x_1} - f'(x_1) \right| < \varepsilon \]

for every pair of points $x_1, x_2 \in [a, b]$ with $|x_1 - x_2| < \delta$.

4. Prove that the function

\[ y = x^2 \sin \frac{1}{x} \]

has a derivative y' at every point of [0, 1], but y' fails to be continuous on [0, 1].

5. (Darboux’s theorem). Prove that if f(x) is differentiable on [a, b], then $f'(x)$ takes every value between $f'(a)$ and $f'(b)$.
6. Suppose f(x) is concave upward on (a, b), as defined in Sec. 7.51. Prove that

\[ f(x) \le \frac{f(b)(x - a) + f(a)(b - x)}{b - a} \]

for all $x \in [a, b]$, i.e., the curve y = f(x) does not go above the chord joining the points of the curve with abscissas a and b.

7. Guided by the preceding problem, suppose that a function f(x) defined on a closed interval [a, b] has the property that

\[ f(x) \le \frac{f(\beta)(x - \alpha) + f(\alpha)(\beta - x)}{\beta - \alpha} \]

for all x in any interval $(\alpha, \beta) \subset [a, b]$, so that the curve y = f(x) does not go above the chord joining the points of the curve with abscissas $\alpha$ and $\beta$. Then f(x) is said to be convex on [a, b]. This generalizes the notion of a function which is concave upward, since we make no assumptions about the differentiability of f(x).

Prove that if f(x) is convex on [a, b], then

\[ f\Big( \sum_{j=1}^{n} \lambda_j x_j \Big) \le \sum_{j=1}^{n} \lambda_j f(x_j) \]

for arbitrary $x_1, \ldots, x_n$ in [a, b] and arbitrary numbers $\lambda_1, \ldots, \lambda_n$ such that

\[ 0 \le \lambda_j \le 1 \quad (j = 1, \ldots, n) \quad \text{and} \quad \sum_{j=1}^{n} \lambda_j = 1. \]

The notion of convexity is explored further in the next six problems.
8. Consider the function

\[ p(\alpha, x) = \frac{f(x) - f(\alpha)}{x - \alpha} \quad (a \le \alpha < x \le b), \]

i.e., the slope of the chord joining the points of the curve y = f(x) with abscissas $\alpha$ and x. Prove that f(x) is convex on [a, b] if and only if $p(\alpha, x)$ is a nondecreasing function of x for every fixed $\alpha$.

9. Prove that a function f(x) convex on [a, b] is continuous on (a, b) and has finite left-hand and right-hand derivatives $f'_l(x)$ and $f'_r(x)$ at every point of (a, b). Prove that $f'_l(x) \le f'_r(x)$.
10. Given that f(x) is convex on [a, b], prove that $f'_l(x)$ and $f'_r(x)$ are nondecreasing on (a, b). Prove that

\[ f'_l(\alpha) \le \frac{f(\beta) - f(\alpha)}{\beta - \alpha} \le f'_r(\beta) \]

whenever $a < \alpha < \beta < b$.

11. Prove that if f(x) is convex on [a, b], then there are no more than countably 
many points of [a, b] at which f(x) fails to have a derivative. 

12. Let f(x) be differentiable on [a, b] and suppose that, given any two points $\alpha$, $\beta$ $(a \le \alpha < \beta \le b)$, there exists a unique point $\gamma$ such that

\[ f'(\gamma) = \frac{f(\beta) - f(\alpha)}{\beta - \alpha}. \]

Prove that either f(x) or $-f(x)$ is convex on [a, b].

13. Prove that if f(x) is convex in a neighborhood of a point $x_0 \in (a, b)$, then its graph does not go below either the left-hand tangent or the right-hand tangent at $x = x_0$ (see Sec. 7.19c), i.e.,

\[ f(x) \ge (x - x_0) f'_l(x_0) + f(x_0) \quad (x \le x_0), \]

\[ f(x) \ge (x - x_0) f'_r(x_0) + f(x_0) \quad (x \ge x_0). \]

14. Prove that a differentiable curve y = f(x) defined for all sufficiently large x has an asymptote y = kx + b (see Chapter 4, Problem 10) if and only if both limits

\[ k = \lim_{x \to \infty} f'(x), \quad b = \lim_{x \to \infty} [f(x) - x f'(x)] \]

exist.

15. Given a function f(x) defined on [a, b], suppose f(x) has a derivative $f'(x)$ at every point of (a, b), where $f'(x) \to p$ as $x \searrow a$. Prove that p is the right-hand derivative of f(x) at x = a.

16. Suppose f(t) is increasing on the interval $0 \le t \le b$, with decreasing derivative $f'(t)$ on $0 < t \le b$ (the derivative may become arbitrarily large as $t \searrow 0$). Prove that


17. Following van der Waerden, let

\[ \varphi_0(x) = \begin{cases} x & \text{if } 0 \le x \le \tfrac{1}{2}, \\ 1 - x & \text{if } \tfrac{1}{2} \le x \le 1, \end{cases} \]

and then continue $\varphi_0(x)$ by periodicity with period 1 (see Sec. 5.65c) over the whole real line. Next let

\[ \varphi_n(x) = \frac{1}{4^n} \varphi_0(4^n x), \]

so that $\varphi_n(x)$ has the period $4^{-n}$ and a derivative everywhere, equal to either 1 or $-1$, except at the points with abscissas $p/(2 \cdot 4^n)$ $(p = 0, \pm 1, \pm 2, \ldots)$.† Prove that the function

\[ f(x) = \sum_{n=0}^{\infty} \varphi_n(x) \]

is continuous everywhere but fails to have a derivative at every point $x_0$.
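
To get a feeling for the function in Problem 17, one can tabulate partial sums of the series. The sketch below is only an exploratory aid, not a solution; the names `phi0` and `partial_sum` and the default of 20 terms are illustrative choices. It uses the fact that $\varphi_0(x)$ is the distance from x to the nearest integer.

```python
def phi0(x):
    """x on [0, 1/2], 1 - x on [1/2, 1], continued with period 1."""
    t = x % 1.0
    return t if t <= 0.5 else 1.0 - t

def partial_sum(x, terms=20):
    """Sum of phi_n(x) = 4**(-n) * phi0(4**n * x) for n = 0, ..., terms - 1."""
    return sum(phi0(4.0**n * x) / 4.0**n for n in range(terms))

# Each term is at most 4**(-n) / 2, so every partial sum lies in [0, 2/3].
for x in (0.0, 0.25, 0.5, 0.3):
    print(x, partial_sum(x))
```

Since the nth term is bounded by $\tfrac{1}{2} \cdot 4^{-n}$, twenty terms already agree with the full sum to within about $10^{-12}$.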


+ G. E. Shilov, Linear Algebra, Sec. 1.31. 

+ As an exercise, the reader should calculate the derivative of the more complicated function $y = [f(x)]^{g(x)} = e^{g(x) \ln f(x)}$.

+ As always, the radical denotes the positive square root. 

+ See Sec. 5.67, esp. Figure 7. 


+ The term “semitangent” can be used instead of “tangent” to emphasize that (23) and (23') are rays (half- 
lines) rather than “full” straight lines. 


+ That is, if and only if there exists a linear function (1) whose deviation from the function y = f(x) is “of a higher order of smallness than h.” Note that two linear functions $y_A$ and $y_B$ differ from one another by a quantity “of the same order of smallness as h,” in fact by the quantity (A − B)h proportional to h.

+ In this case, we regard [a, b] as an interval of the extended real number system R (see Sec. 1.9). 

+ By a neighborhood of a point $x_0$ we mean a set of the form $U_\delta = \{x: |x - x_0| < \delta\}$ and by a deleted neighborhood of $x_0$ we mean a set $U_\delta$ minus the point $x_0$ itself, i.e., a set of the form $U'_\delta = \{x: 0 < |x - x_0| < \delta\}$ (cf. Sec. 4.15a).

+ A function is said to be differentiable on a set E if it is differentiable at every point of E. 

+ These points correspond to “corners” of the curve $y = \varphi_n(x)$.


8 Higher Derivatives 


8.1. Definitions and Examples 


8.11. Let y = f(x) be a function differentiable on the interval (a, b), with derivative $f'(x)$, and suppose the function $f'(x)$ is in turn differentiable on (a, b). Then the derivative of $f'(x)$ is called the second derivative of f(x), denoted by $y'' = f''(x)$. Repeating this process, we get a sequence of functions

\[ f(x), \; f'(x), \; f''(x), \; \ldots, \; f^{(n)}(x), \; \ldots \]

on the interval (a, b), where

\[ f^{(n)}(x) = [f^{(n-1)}(x)]' \quad (n = 1, 2, \ldots). \]

The function $f^{(n)}(x)$ is called the nth derivative (or the derivative of order n) of f(x). The operation leading from f(x) to $f^{(n)}(x)$ is called n-fold differentiation of f(x). If the function $f'(x)$ exists and is continuous, we say that f(x) is smooth (or continuously differentiable). The function f(x) itself is regarded as its own derivative of order zero: $f^{(0)}(x) = f(x)$.

8.12. a. THEOREM. If f(x) and g(x) have derivatives of order n at every point of (a, b), then

\[ [k f(x)]^{(n)} = k f^{(n)}(x), \quad [f(x) + g(x)]^{(n)} = f^{(n)}(x) + g^{(n)}(x) \]

for arbitrary real k and every $x \in (a, b)$.

Proof. Apply formulas (3) and (4), p. 224 repeatedly. ∎


b. THEOREM (Leibniz’s rule). If f(x) and g(x) have derivatives of order n at every point of (a, b), then

\[ [f(x) g(x)]^{(n)} = \sum_{k=0}^{n} C_n^k f^{(k)}(x) g^{(n-k)}(x), \tag{1} \]

in terms of the binomial coefficients†

\[ C_n^k = \frac{n!}{k!\,(n-k)!}. \]

Proof. The proof is by induction. If (1) holds for some n, then differentiating (1) again, we get

\[ (fg)^{(n+1)} = \sum_{k=0}^{n} C_n^k \left[ f^{(k)} g^{(n-k+1)} + f^{(k+1)} g^{(n-k)} \right] \]
\[ = \sum_{k=0}^{n} C_n^k f^{(k)} g^{(n-k+1)} + \sum_{k=1}^{n+1} C_n^{k-1} f^{(k)} g^{(n-k+1)} \]
\[ = f g^{(n+1)} + \sum_{k=1}^{n} (C_n^k + C_n^{k-1}) f^{(k)} g^{(n+1-k)} + f^{(n+1)} g \]
\[ = \sum_{k=0}^{n+1} C_{n+1}^k f^{(k)} g^{(n+1-k)}, \]

because of the easily verified formula

\[ C_n^k + C_n^{k-1} = C_{n+1}^k. \]

Thus the validity of (1) for n implies its validity for n + 1. Since (1) obviously holds for n = 1, the induction is now complete. ∎
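
Leibniz’s rule can be spot-checked on polynomials, where derivatives are computed exactly. The sketch below is an illustration under the assumption that a polynomial is stored as a list of integer coefficients `[c0, c1, c2, ...]`; the helper names are hypothetical.

```python
from math import comb

def deriv(p, n=1):
    """n-th derivative of the polynomial p[0] + p[1] x + p[2] x^2 + ..."""
    for _ in range(n):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p

def mul(p, q):
    """Product of two polynomials given by coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    m = max(len(p), len(q))
    p = p + [0] * (m - len(p))
    q = q + [0] * (m - len(q))
    return [a + b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def leibniz(f, g, n):
    """Right-hand side of Leibniz's rule: sum of C(n, k) f^(k) g^(n-k)."""
    total = [0]
    for k in range(n + 1):
        term = [comb(n, k) * c for c in mul(deriv(f, k), deriv(g, n - k))]
        total = add(total, term)
    return trim(total)

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
g = [0, -1, 4]     # -x + 4x^2
direct = trim(deriv(mul(f, g), 3))   # differentiate the product directly
print(direct, leibniz(f, g, 3))
```

For this pair both computations give $48 - 72x + 720x^2$ for the third derivative of the product.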


8.13. The following table shows the higher derivatives of various commonly 
encountered functions: 


f(x)        f'(x)           f''(x)               ...   f^(n)(x)

$x^a$       $a x^{a-1}$     $a(a-1) x^{a-2}$     ...   $a(a-1)\cdots(a-n+1) x^{a-n}$

$e^{ax}$    $a e^{ax}$      $a^2 e^{ax}$         ...   $a^n e^{ax}$

$\ln x$     $1/x$           $-1/x^2$             ...   $(-1)^{n-1} (n-1)!/x^n$

$\sin bx$   $b \cos bx$     $-b^2 \sin bx$       ...   $b^n \sin(bx + n\pi/2)$

$\cos bx$   $-b \sin bx$    $-b^2 \cos bx$       ...   $b^n \cos(bx + n\pi/2)$


Note that differentiating a polynomial has the effect of lowering its degree by 1, 
so that n-fold differentiation of a polynomial of degree n leads to a constant, 
while all derivatives of order n + 1 and higher vanish. 


8.14. Given any polynomial

\[ P(x) = \sum_{k=0}^{n} a_k x^k \tag{2} \]

of degree n and any real number a, we can expand P(x) in powers of x − a as follows. Substitution of x = (x − a) + a into (2) gives

\[ P(x) = P((x - a) + a) = \sum_{k=0}^{n} a_k ((x - a) + a)^k, \]

which can be written in the form

\[ P(x) = b_0 + b_1 (x - a) + b_2 (x - a)^2 + \cdots + b_n (x - a)^n. \tag{3} \]

To find the coefficients $b_0, b_1, \ldots, b_n$, we first set x = a in (3). This gives $P(a) = b_0$. We then differentiate (3) with respect to x, obtaining

\[ P'(x) = b_1 + 2 b_2 (x - a) + 3 b_3 (x - a)^2 + \cdots + n b_n (x - a)^{n-1}. \tag{4} \]

Setting x = a then gives $P'(a) = b_1$. Similarly, differentiating (4) and setting x = a, we get first

\[ P''(x) = 2 b_2 + 3 \cdot 2 \, b_3 (x - a) + \cdots + n(n-1) b_n (x - a)^{n-2} \]

and then $P''(a) = 2 b_2$. Continuing this process, we find more generally that

\[ P^{(k)}(a) = k! \, b_k \quad (k = 0, 1, \ldots, n), \quad (0! = 1) \]

or equivalently,

\[ b_k = \frac{1}{k!} P^{(k)}(a). \tag{5} \]

Substituting (5) into (3), we finally get

\[ P(x) = \sum_{k=0}^{n} \frac{P^{(k)}(a)}{k!} (x - a)^k. \tag{6} \]
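
The recipe $b_k = P^{(k)}(a)/k!$ translates directly into a short computation. The following is a minimal sketch, assuming the polynomial is given by its coefficient list `[a0, a1, ..., an]`; the function names are illustrative.

```python
from math import factorial

def evaluate(p, x):
    """Value of p[0] + p[1] x + ... at x, by Horner's scheme."""
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result

def recenter(p, a):
    """Coefficients b_k = P^(k)(a) / k! of P in powers of (x - a)."""
    b = []
    q = list(p)
    for k in range(len(p)):
        b.append(evaluate(q, a) / factorial(k))
        # Replace q by its derivative for the next coefficient.
        q = [j * c for j, c in enumerate(q)][1:] or [0]
    return b

# x^3 = 1 + 3(x - 1) + 3(x - 1)^2 + (x - 1)^3
print(recenter([0, 0, 0, 1], 1))
```

For $P(x) = x^3$ and a = 1 this returns the coefficients 1, 3, 3, 1, in agreement with the binomial expansion of $((x - 1) + 1)^3$.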


8.2. Taylor’s Formula 


8.21. LEMMA. Suppose the functions F(x) and G(x) are continuous on a finite interval [a, b], with derivatives up to order n + 1 inclusive on (a, b), where the functions $G^{(k)}(x)$ $(k = 0, 1, \ldots, n)$ are nonvanishing on (a, b). Suppose further that the functions $F^{(k)}(x)$, $G^{(k)}(x)$ $(k = 0, 1, \ldots, n)$ approach zero as $x \searrow a$. Then

\[ \frac{F(b)}{G(b)} = \frac{F^{(n+1)}(c)}{G^{(n+1)}(c)} \tag{1} \]

for some point $c \in (a, b)$.

Proof. By Cauchy’s theorem (Theorem 7.43), there is a point $c_1 \in (a, b)$ such that

\[ \frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F(b)}{G(b)} = \frac{F'(c_1)}{G'(c_1)}. \]

Applying Cauchy’s theorem again to the interval $(a, c_1)$, we find a point $c_2 \in (a, c_1)$ such that

\[ \frac{F(b)}{G(b)} = \frac{F'(c_1)}{G'(c_1)} = \frac{F'(c_1) - F'(a)}{G'(c_1) - G'(a)} = \frac{F''(c_2)}{G''(c_2)}. \]

Continuing this process, we finally find at the (n + 1)st step that

\[ \frac{F(b)}{G(b)} = \frac{F^{(n)}(c_n)}{G^{(n)}(c_n)} = \frac{F^{(n)}(c_n) - F^{(n)}(a)}{G^{(n)}(c_n) - G^{(n)}(a)} = \frac{F^{(n+1)}(c)}{G^{(n+1)}(c)} \]

for some point $c \in (a, c_n) \subset (a, b)$. ∎


8.22. THEOREM. Suppose the function f(x) is continuous on a finite interval [a, b], with derivatives up to order n + 1 inclusive on (a, b). Suppose further that the functions $f^{(k)}(x)$ $(k = 0, 1, \ldots, n)$ approach finite limits $f^{(k)}(a)$ as $x \searrow a$. Then Taylor’s formula

\[ f(b) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (b - a)^k + \frac{f^{(n+1)}(c)}{(n+1)!} (b - a)^{n+1} \tag{2} \]

holds for some point $c \in (a, b)$.

Proof. For n = 0 formula (2) reduces to the finite difference formula (3’), p. 239. Moreover, $f^{(n+1)}(x) \equiv 0$ for a polynomial of degree n, and we then get formula (6) on the preceding page. More generally, let

\[ F(x) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k, \quad G(x) = (x - a)^{n+1}. \tag{3} \]

Then F(x) has derivatives up to order n + 1, like f(x) itself, while the function G(x) has derivatives of all orders, with $G^{(k)}(x) \ne 0$ $(k = 0, 1, \ldots, n)$ for $x \ne a$. Moreover,

\[ \left[ \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k \right]^{(m)}_{x=a} = f^{(m)}(a), \quad \left[ (x - a)^{n+1} \right]^{(m)}_{x=a} = 0 \]

for $m = 0, 1, \ldots, n$, since clearly

\[ \left[ (x - a)^k \right]^{(m)}_{x=a} = \begin{cases} 0 & \text{if } k \ne m, \\ m! & \text{if } k = m, \end{cases} \]

and hence

\[ F^{(m)}(a) = f^{(m)}(a) - f^{(m)}(a) = 0, \quad G^{(m)}(a) = 0 \quad (m = 0, 1, \ldots, n). \]

Thus F(x) and G(x) have all the properties of the functions figuring in the lemma, and furthermore

\[ F^{(n+1)}(x) = f^{(n+1)}(x), \quad G^{(n+1)}(x) = (n + 1)!. \tag{4} \]

Therefore, substituting (3) and (4) into (1), we get

\[ \frac{F(b)}{G(b)} = \frac{f(b) - \sum_{k=0}^{n} \dfrac{f^{(k)}(a)}{k!} (b - a)^k}{(b - a)^{n+1}} = \frac{f^{(n+1)}(c)}{(n+1)!}, \]

which is equivalent to (2). ∎


It is tacitly assumed in Theorem 8.22 that a < b, but Taylor’s formula remains valid in the case b < a provided that b < c < a (check this). Thus, regardless of the position of the point $b \ne a$, (2) holds for some point c between a and b.
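
A quick numerical check of formula (2), offered as an illustration rather than as part of the text: take $f(x) = e^x$, a = 0, b = 1 and n = 5 (all illustrative choices). Since $f^{(n+1)}(c) = e^c$ with 0 < c < 1, the error of the Taylor polynomial must lie strictly between $1/(n+1)!$ and $e/(n+1)!$.

```python
import math

def taylor_exp(b, n):
    """Taylor polynomial of e^x about a = 0, of degree n, evaluated at b."""
    return sum(b**k / math.factorial(k) for k in range(n + 1))

b, n = 1.0, 5
error = math.exp(b) - taylor_exp(b, n)

# Lagrange form: error = e^c * b^(n+1) / (n+1)! for some c in (0, b),
# so the error lies between the values obtained for c = 0 and c = b.
lower = b**(n + 1) / math.factorial(n + 1)
upper = math.exp(b) * b**(n + 1) / math.factorial(n + 1)

# Solve e^c = error * (n+1)! / b^(n+1) for the intermediate point c.
c = math.log(error * math.factorial(n + 1) / b**(n + 1))
print(lower, error, upper, c)
```

The recovered value of c lies between 0 and 1, as the theorem requires.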


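8.23. Taylor’s formula is often written in the form

\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k + \frac{f^{(n+1)}(c)}{(n+1)!} (x - a)^{n+1} \quad (a < c < x). \]

The first term on the right is a polynomial of degree n, called Taylor’s polynomial, while the second term

\[ R_n(x, a) = \frac{f^{(n+1)}(c)}{(n+1)!} (x - a)^{n+1} \tag{5} \]

is called the remainder (in Lagrange’s form). If $f^{(n+1)}(x)$ is bounded as $x \searrow a$, the remainder is “of a higher order of smallness than $(x - a)^n$,” i.e., $R_n(x, a) = o((x - a)^n)$ (see Sec. 4.38a). In particular, it is easily verified that the following expansions hold for $x \searrow 0$ or $x \nearrow 0$:

\[ e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + o(x^n), \tag{6} \]

\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + o(x^{2n+1}), \tag{7} \]

\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + o(x^{2n+2}), \tag{8} \]

\[ \ln(1 + x) = x - \frac{x^2}{2} + \cdots + (-1)^{n-1} \frac{x^n}{n} + o(x^n), \tag{9} \]

\[ (1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!} x^2 + \cdots + \frac{\alpha(\alpha - 1) \cdots (\alpha - n + 1)}{n!} x^n + o(x^n). \tag{10} \]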


8.24. The importance of Taylor’s formula consists in the fact that it allows us to replace the function f(x), which may well be very complicated, by a comparatively simple function, i.e., a polynomial, with an error (5) which in many cases can be estimated quite simply and made as small as we please. Suppose it is known that

\[ \sup_{a \le x \le a+h} |f^{(n)}(x)| = M_n(h) \quad (n = 0, 1, 2, \ldots). \]

Then we have the estimate

\[ |R_n(x, a)| \le \frac{M_{n+1}(h)}{(n+1)!} h^{n+1} \]

for the remainder (5). Denoting the expression on the right by $\omega = \omega(n, h)$, we now pose three natural problems, each involving the determination of one of the quantities $\omega$, n, and h in terms of the other two.
(a) Given n and h, find $\omega$. In other words, find an upper bound for the error made in replacing a function f(x) on the interval (a, a + h) by its Taylor polynomial of degree n.
(b) Given n and $\omega$, find h. In other words, find the interval (a, a + h) on which the error made in replacing f(x) by its Taylor polynomial of degree n is guaranteed not to exceed a given quantity $\omega$.
(c) Given h and $\omega$, find n. In other words, find the degree of the Taylor polynomial such that the error made in replacing f(x) by the polynomial does not exceed a given quantity $\omega$ on a given interval (a, a + h).

In concrete cases all three problems can be solved by a more or less 
elementary calculation (see Problem 10). The same problems can be posed for 
the interval (a — h, a) and solved in the same way. 
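
Problem (c) lends itself to a direct search. The sketch below makes a simplifying assumption that is not in the text: every derivative of f is bounded on the interval by one constant `bound`, so that $M_{n+1}(h) \le$ `bound` for all n (true with `bound = 1` for sin x and cos x). The function name is hypothetical.

```python
from math import factorial

def smallest_degree(h, omega, bound=1.0):
    """Smallest n with bound * h**(n+1) / (n+1)! <= omega.

    Assumes `bound` dominates |f^(k)(x)| on (a, a + h) for every k.
    The loop terminates because h**n / n! tends to zero as n grows.
    """
    n = 0
    while bound * h**(n + 1) / factorial(n + 1) > omega:
        n += 1
    return n

# Degree needed to approximate sin x on (0, 1) to within 1e-6.
print(smallest_degree(1.0, 1e-6))
```

With h = 1 and $\omega = 10^{-6}$ the search stops at n = 9, since $1/10! \approx 2.8 \cdot 10^{-7}$ while $1/9! \approx 2.8 \cdot 10^{-6}$.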


8.3. More on Concavity and Inflection Points 


8.31. In Sec. 7.5 we used the values of the derivative $f'(x)$ in a neighborhood of the point $x_0$ to determine the position of the curve y = f(x) relative to its tangent at $x_0$. We now consider another approach to the same problem, which uses the value of the second derivative at the single point $x = x_0$ rather than the whole set of values of $f'(x)$ near $x_0$.

THEOREM. Suppose f(x) has a first derivative $f'(x)$ in some interval (a, b) containing the point $x_0$ and a second derivative† $f''(x_0)$ at the point $x_0$ itself. Then f(x) is concave downward at $x_0$ if $f''(x_0) < 0$ and concave upward at $x_0$ if $f''(x_0) > 0$.

Proof. If $f''(x_0) < 0$, then, applying formula (6), p. 234 to the function $f'(x)$, we see that there is a $\delta > 0$ such that $f'(x_0 - h) > f'(x_0) > f'(x_0 + h)$ whenever $0 < h < \delta$. But then f(x) is concave downward at $x_0$, by Theorem 7.53a applied to the interval $(x_0 - \delta, x_0 + \delta)$. The case $f''(x_0) > 0$ is treated similarly. ∎


8.32. The above theorem leads to new sufficient conditions for the presence of local extrema:

THEOREM. Suppose f(x) has a first derivative $f'(x)$ in some interval (a, b) containing the point $x_0$ and a second derivative $f''(x_0)$ at the point $x_0$ itself. Then f(x) has a local maximum at $x_0$ if $f'(x_0) = 0$ and $f''(x_0) < 0$, while f(x) has a local minimum at $x_0$ if $f'(x_0) = 0$ and $f''(x_0) > 0$.

Proof. An immediate consequence of Theorem 8.31 and the argument given in the proof of Theorem 7.54. ∎
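
The second-derivative test of Sec. 8.32 can be phrased as a small decision procedure. The sketch below assumes the caller supplies the first and second derivatives as Python functions; the name `classify` and the tolerance are illustrative choices. A negative second derivative at a critical point gives a maximum, a positive one a minimum, and a zero value leaves the question open.

```python
def classify(df, d2f, x0, tol=1e-12):
    """Apply the second-derivative test at x0, given f' and f'' as callables."""
    if abs(df(x0)) > tol:
        return "not a critical point"
    second = d2f(x0)
    if second < 0:
        return "local maximum"
    if second > 0:
        return "local minimum"
    return "inconclusive"   # f''(x0) = 0: the test gives no information

# Example: f(x) = x^3 - 3x, with f'(x) = 3x^2 - 3 and f''(x) = 6x.
df = lambda x: 3 * x * x - 3
d2f = lambda x: 6 * x
print(classify(df, d2f, -1.0), classify(df, d2f, 1.0))
```

The inconclusive branch is the situation taken up later in the section, where higher derivatives are brought in.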

8.33. What extra information about the function f(x) is given by a knowledge of the values of $f''(x)$ in a neighborhood of the point $x_0$ as well as at the point $x_0$ itself? It turns out that a knowledge of $f''(x)$ allows us to determine the position of the curve y = f(x) relative to the parabola

\[ y = P(x) = f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2, \tag{1} \]

called the osculating parabola to the curve y = f(x) at the point $x_0$. We begin with the analogues of Lemmas 7.52a and 7.52b.

a. LEMMA. Let C be the curve with equation y = f(x) (a < x < b), and let P be the osculating parabola to C at the point $x_0$. Then C lies above P to the left of $x_0$ $(x < x_0)$ if $f''(\xi) > f''(x_0)$ for all $\xi \in (a, x_0)$, while C lies below P to the left of $x_0$ if $f''(\xi) < f''(x_0)$ for all $\xi \in (a, x_0)$.

Proof. By Taylor’s formula with n = 1 we have

\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(\xi)(x - x_0)^2 > f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2 = P(x) \]

if $f''(\xi) > f''(x_0)$, while $f(x) < P(x)$ if $f''(\xi) < f''(x_0)$. ∎

b. LEMMA. Let C and P be the same as in the preceding lemma. Then C lies above P to the right of $x_0$ $(x > x_0)$ if $f''(\eta) > f''(x_0)$ for all $\eta \in (x_0, b)$, while C lies below P to the right of $x_0$ if $f''(\eta) < f''(x_0)$ for all $\eta \in (x_0, b)$.

Proof. The same as above, with $\xi$ replaced by $\eta$. ∎


8.34. If $f''(x_0) = 0$, the osculating parabola with equation (1) reduces to the line with equation

\[ y = f(x_0) + f'(x_0)(x - x_0), \]

i.e., the tangent T to the curve C at the point $x_0$. This observation leads to the following analogue of Theorem 7.54:

THEOREM. Suppose $f''(x_0) = 0$. Then f(x) is concave upward at $x_0$ if

\[ f''(\xi) > 0, \quad f''(\eta) > 0 \]

whenever $a < \xi < x_0 < \eta < b$ and concave downward at $x_0$ if

\[ f''(\xi) < 0, \quad f''(\eta) < 0 \]

whenever $a < \xi < x_0 < \eta < b$. On the other hand, $x_0$ is an inflection point of f(x) if

\[ f''(\xi) > 0 > f''(\eta) \]

or

\[ f''(\xi) < 0 < f''(\eta) \]

whenever $a < \xi < x_0 < \eta < b$.

Proof. An immediate consequence of the above two lemmas. ∎


8.35. Treating the second derivative $f''(x)$ in the same way as the first derivative $f'(x)$ was treated earlier, we now make use of the value of the third derivative $f'''(x)$ at the single point $x = x_0$ rather than the whole set of values of $f''(x)$ near $x_0$. This leads to the following analogue of Theorem 8.31:

THEOREM. Suppose f(x) has first and second derivatives $f'(x)$ and $f''(x)$ in some interval (a, b) containing the point $x_0$ and a third derivative $f'''(x_0)$ at the point $x_0$ itself, and let C and P be the same as before. Then C lies above P to the left of $x_0$ and below P to the right of $x_0$ if $f'''(x_0) < 0$, while C lies below P to the left of $x_0$ and above P to the right of $x_0$ if $f'''(x_0) > 0$.

Proof. If $f'''(x_0) < 0$, then, applying formula (6), p. 234 to the function $f''(x)$, we see that there is a $\delta > 0$ such that $f''(x_0 - h) > f''(x_0) > f''(x_0 + h)$ whenever $0 < h < \delta$. But then, by Lemmas 8.33a and 8.33b, C lies above P to the left of $x_0$ and below P to the right of $x_0$. The case $f'''(x_0) > 0$ is treated similarly. ∎


8.36. THEOREM. Let f(x) be the same as in the preceding theorem. Then $x_0$ is an inflection point of f(x) if $f''(x_0) = 0$, $f'''(x_0) \ne 0$.

Proof. An immediate consequence of the preceding theorem, since P reduces to the tangent T to the curve C at the point $x_0$ if $f''(x_0) = 0$. ∎

If $f'''(x_0) = 0$, the behavior of f(x) near $x_0$ is determined by the sign of the first nonvanishing derivative $f^{(n)}(x_0)$, as shown in Problem 11.


8.4. Another Version of Taylor’s Formula 


8.41. We now give another way of writing Taylor’s formula, based on the use of the “asymptotic unit” E (see Sec. 4.39). This way of writing Taylor’s formula is useful in a variety of problems involving the behavior of functions near given points. Suppose f(x) has derivatives up to order n + 1 inclusive in a neighborhood of a point a, where $f^{(n)}(a) \ne 0$ and $f^{(n+1)}(x)$ is bounded. Then Taylor’s formula (2), p. 252, can be written as

\[ f(a + h) = f(a) + f'(a) h + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!} h^{n-1} + \frac{f^{(n)}(a)}{n!} h^n E, \tag{1} \]

where E denotes a quantity approaching 1 as $h \to 0$. In fact, in this case, E is just

\[ E = 1 + \frac{h}{n+1} \, \frac{f^{(n+1)}(c)}{f^{(n)}(a)}. \]


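8.42. Comparing (1) with formulas (6)-(10), p. 254, and choosing the indicated values of n we get†

\[ e^x = 1 + xE \quad (n = 1), \tag{2} \]

\[ \cos x = 1 - \frac{x^2}{2} E \quad (n = 2), \tag{3} \]

\[ \sin x = xE \quad (n = 1), \tag{4} \]

\[ \ln(1 + x) = xE \quad (n = 1), \tag{5} \]

\[ (1 + x)^\alpha = 1 + \alpha x E \quad (n = 1). \tag{6} \]

Choosing larger values of n, we get the following more accurate formulas:

\[ e^x = 1 + x + \frac{x^2}{2} E \quad (n = 2), \tag{2'} \]

\[ \cos x = 1 - \frac{x^2}{2} + \frac{x^4}{24} E \quad (n = 4), \tag{3'} \]

\[ \sin x = x - \frac{x^3}{6} E \quad (n = 3), \tag{4'} \]

\[ \ln(1 + x) = x - \frac{x^2}{2} E \quad (n = 2), \tag{5'} \]

\[ (1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2} x^2 E \quad (n = 2). \tag{6'} \]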


8.43. Examples 


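a. Evaluate

\[ \lim_{x \to 0} \frac{e^x - \cos x}{\sin x}. \]

Solution. Using (2)-(4), we find that

\[ \frac{e^x - \cos x}{\sin x} = \frac{1 + xE - 1 + \dfrac{x^2}{2} E}{xE} = \frac{E + \dfrac{x}{2} E}{E} \to 1 \]

as $x \to 0$. The same result can also be obtained by using L’Hospital’s rule for 0/0 (Theorem 7.61).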
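
The role of the “asymptotic units” in formulas (2)-(4) can be made concrete numerically. In the sketch below (an illustration; the function names are hypothetical) each E is recovered by solving the corresponding formula for it, and the quotient $(e^x - \cos x)/\sin x$ of Example a is evaluated alongside.

```python
import math

def asymptotic_units(x):
    """The three quantities E defined by (2), (3) and (4) for a given x != 0."""
    e_exp = (math.exp(x) - 1.0) / x              # e^x = 1 + x E
    e_cos = (1.0 - math.cos(x)) / (x * x / 2.0)  # cos x = 1 - (x^2 / 2) E
    e_sin = math.sin(x) / x                      # sin x = x E
    return e_exp, e_cos, e_sin

def quotient(x):
    """The expression whose limit as x -> 0 is sought in Example a."""
    return (math.exp(x) - math.cos(x)) / math.sin(x)

for x in (0.1, 0.01, 0.001):
    print(x, asymptotic_units(x), quotient(x))
```

Note that the three E's are different functions of x; the notation only records that each of them tends to 1.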


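b. Evaluate

\[ \lim_{x \to 0} \frac{e^{x^2} - \sqrt{1 - x^2 + x^3}}{\ln(1 + x^2)}. \]

Solution. Using (2), (5), and (6), we find that

\[ \frac{e^{x^2} - \sqrt{1 - x^2 + x^3}}{\ln(1 + x^2)} = \frac{1 + x^2 E - 1 - \tfrac{1}{2}(-x^2 + x^3) E}{x^2 E} = \frac{E + \tfrac{1}{2}(1 - x) E}{E} \to \frac{3}{2} \]

as $x \to 0$, a result which can again be obtained by using L’Hospital’s rule for 0/0.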


c. How does the function

\[ f(x) = \frac{x - \sin x}{x^3} \]

behave near the point x = 0?

Solution. By (4'), we have

\[ f(x) = \frac{x - \sin x}{x^3} = \frac{1}{6} E \to \frac{1}{6} \]

as $x \to 0$. To investigate f(x) further, we use the more exact expression

\[ \sin x = x - \frac{x^3}{6} + \frac{x^5}{120} E \]

implied by formula (8), p. 254. We then get

\[ f(x) = \frac{1}{6} - \frac{x^2}{120} E, \]

which shows that f(x) has a local maximum at x = 0.


8.5. Taylor Series 


8.51. Let f(x) be a function with derivatives of all orders at every point of an interval (a, b).† Choosing any point $x_0 \in (a, b)$, we write Taylor’s formula

\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \cdots + \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n + R_n(x, x_0) \]

with remainder

\[ R_n(x, x_0) = \frac{f^{(n+1)}(c)}{(n+1)!} (x - x_0)^{n+1}, \tag{1} \]

where c lies between $x_0$ and x (see Sec. 8.23). Suppose (1) approaches zero as $n \to \infty$, for some given value of x. Then the (infinite) series

\[ f(x_0) + f'(x_0)(x - x_0) + \cdots + \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n + \cdots = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n \tag{2} \]

is convergent, with sum f(x). The series (2), written for an infinitely differentiable function f(x), is called the Taylor series of f(x), regardless of whether or not the series converges and regardless of whether or not the sum of the series (if convergent) equals f(x).


8.52. Clearly, the Taylor series (2) is convergent with sum f(x) if and only if the remainder (1) approaches zero as $n \to \infty$, in which case f(x) is the sum of its own Taylor series and we are entitled to write†

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n. \tag{3} \]

A condition guaranteeing this is given by the following

THEOREM. If

\[ \frac{1}{n! \, B^n} \sup_{a < x < b} |f^{(n)}(x)| \le C \quad (n = 0, 1, 2, \ldots) \tag{4} \]

for suitable constants B > 0 and C > 0, then

\[ \lim_{n \to \infty} R_n(x, x_0) = 0 \]

for all $x \in (a, b)$ such that $|x - x_0| < 1/B$.

Proof. It follows from (1) and (4) that

\[ |R_n(x, x_0)| \le \frac{C (n+1)! \, B^{n+1} |x - x_0|^{n+1}}{(n+1)!} = C (B |x - x_0|)^{n+1}. \]

But the expression on the right approaches zero as $n \to \infty$ if $|x - x_0| < 1/B$. ∎


8.53. Examples 


a. The inequality (4) holds true for any polynomial of degree m, since $f^{(m+1)}(x)$ and all higher derivatives vanish. Thus every polynomial has a Taylor series expansion, which reduces to the finite sum (6) already found on p. 251.

b. If $f(x) = e^x$, then $f^{(n)}(x) = e^x$, so that

\[ \frac{1}{n! \, B^n} \sup_{-a < x < a} |f^{(n)}(x)| \le \frac{e^a}{n! \, B^n} \]

for all a > 0, B > 0. But the right-hand side approaches zero as $n \to \infty$ (with a and B fixed), and hence forms a bounded sequence.† It follows from Theorem 8.52 that the Taylor series of $e^x$ converges to $e^x$ on every interval (−a, a) and hence for all real x:

\[ e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots \quad (-\infty < x < \infty). \tag{5} \]

c. If f(x) = sin x, then

\[ |f^{(n)}(x)| = \left| \sin\left( x + \frac{n\pi}{2} \right) \right| \le 1 \]

for all x (cf. Sec. 8.13), so that

\[ \frac{1}{n! \, B^n} \sup_{-a < x < a} |f^{(n)}(x)| \le \frac{1}{n! \, B^n} \]

for all a > 0, B > 0. The right-hand side again approaches zero as $n \to \infty$ (with B fixed), and hence forms a bounded sequence. Therefore the Taylor series of sin x converges to sin x for all real x:

\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \quad (-\infty < x < \infty). \tag{6} \]

d. In just the same way, we find that cos x has the Taylor expansion

\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots \quad (-\infty < x < \infty). \tag{7} \]
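
The convergence asserted in Example c can be observed directly by summing the series (6). The following is a minimal sketch; the function name and the numbers of terms are illustrative, and each term is obtained from the previous one so that no factorial is formed explicitly.

```python
import math

def sin_partial_sum(x, terms):
    """Sum of the first `terms` terms of x - x^3/3! + x^5/5! - ..."""
    total = 0.0
    term = x
    for k in range(terms):
        total += term
        # Ratio of consecutive terms: -x^2 / ((2k + 2)(2k + 3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

for terms in (2, 5, 10, 15):
    print(terms, sin_partial_sum(2.0, terms), math.sin(2.0))
```

For large |x| many more terms are needed before the factorials in the denominators take over, and in floating-point arithmetic the large intermediate terms cost some accuracy, but the partial sums still settle down to sin x.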


8.54. In our treatment of the trigonometric functions (see Sec. 5.6), we defined sin x and cos x as the functions satisfying the formulas

\[ \sin^2 x + \cos^2 x = 1, \tag{8} \]

\[ \sin(x + y) = \sin x \cos y + \cos x \sin y, \tag{9} \]

\[ \cos(x + y) = \cos x \cos y - \sin x \sin y, \tag{10} \]

\[ 0 < \sin x < x < \frac{\sin x}{\cos x}, \tag{11} \]

where (11) holds for all sufficiently small positive x, say for $0 < x < \delta$. The fact that sin x and cos x have the expansions (6) and (7) now proves the uniqueness of sin x and cos x. To complete the theory, we must still prove the existence of functions sin x and cos x satisfying (8)-(11). This will be done in Secs. 8.66-8.68, by the simple expedient of showing that if we define the functions sin x and cos x by (6) and (7), then they satisfy (8)-(11).


8.55. Analytic continuation. Given that f(x) is infinitely differentiable on an interval (a, b) and has a Taylor expansion

\[ f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n, \quad a_n = \frac{f^{(n)}(x_0)}{n!} \tag{12} \]

on (a, b), suppose we consider (12) in the complex plane by replacing the real variable x by the complex variable z = x + iy. This gives the power series

\[ \sum_{n=0}^{\infty} a_n (z - x_0)^n, \tag{13} \]

whose region of convergence is some disk G of radius $\rho$ centered at the point $x_0$ (see Theorem 6.62). At the very least, $\rho$ must be such that the whole interval (a, b) lies inside the “circle of convergence” $\Gamma = \{z: |z - x_0| = \rho\}$, since the series (13) diverges at every point outside $\Gamma$ (see Sec. 6.63). By hypothesis, the sum of (13) equals f(x) at every point $x \in (a, b)$. At the other points of G, the sum of (13) is some function of z which we denote by f(z) and call the analytic continuation of f(x). In particular, if (13) converges for all real x, then (13) converges for all complex z, i.e., the analytic continuation of f(x) is defined in the whole complex plane.


8.6. Complex Exponentials and Trigonometric Functions 


We now use analytic continuation to define the exponential and trigonometric 
functions “in the complex domain,” i.e., for complex values of the argument. 


8.61. The Taylor series

\[ e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots \tag{1} \]

converges for all real x (Example 8.53b), and hence the series

\[ e^z = 1 + z + \frac{z^2}{2!} + \cdots + \frac{z^n}{n!} + \cdots \tag{1'} \]

converges for all complex z, thereby defining the function $e^z$ on the whole complex plane (as the analytic continuation of $e^x$).


8.62. Similarly, the Taylor series

\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \tag{2} \]

and

\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots \tag{3} \]

converge for all real x (Examples 8.53c and 8.53d), and hence the series

\[ \sin z = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \frac{z^7}{7!} + \cdots \tag{2'} \]

and

\[ \cos z = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \frac{z^6}{6!} + \cdots \tag{3'} \]

converge for all complex z, thereby defining the trigonometric functions sin z and cos z on the whole complex plane (as the analytic continuations of sin x and cos x). In particular, it follows from (2') and (3') that

\[ \sin(-z) = -\sin z, \tag{4} \]

\[ \cos(-z) = \cos z, \tag{5} \]

i.e., sin z is odd and cos z is even for complex z (just as for real values of the argument).


8.63. The expansions (1')-(3') imply a remarkable connection between the exponential and trigonometric functions in the complex domain. In fact, replacing z by iz in (1'), we find that

\[ e^{iz} = 1 + iz - \frac{z^2}{2!} - i\frac{z^3}{3!} + \frac{z^4}{4!} + i\frac{z^5}{5!} - \cdots = \left( 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots \right) + i \left( z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots \right) \]

and hence

\[ e^{iz} = \cos z + i \sin z \tag{6} \]

for arbitrary complex z, a result known as Euler’s formula. Replacing z by −z, with the help of (4) and (5), we get

\[ e^{-iz} = \cos z - i \sin z. \tag{7} \]

Together (6) and (7) imply the formulas

\[ \cos z = \frac{e^{iz} + e^{-iz}}{2}, \tag{8} \]

\[ \sin z = \frac{e^{iz} - e^{-iz}}{2i}, \tag{9} \]

which can also be obtained directly from (1')-(3'). Letting z be a real number $\theta$ in (6), we get

\[ e^{i\theta} = \cos \theta + i \sin \theta. \tag{10} \]

Thus every complex number of the form $e^{i\theta}$ ($\theta$ real) has absolute value 1. It will be recalled from Sec. 5.72 that every complex number z can be written in the trigonometric form

\[ z = r(\cos \theta + i \sin \theta), \tag{11} \]

where z has absolute value r and argument $\theta$. Combining (10) and (11), we can write z in the exponential form

\[ z = r e^{i\theta}. \tag{11'} \]
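
Euler’s formula can be checked numerically by summing the series (1') for a complex argument and comparing with the library values of the complex sine and cosine. This is a sketch for illustration only; the sample point and the number of terms are arbitrary choices.

```python
import cmath

def exp_series(z, terms=60):
    """Partial sum of 1 + z + z^2/2! + ... with `terms` terms."""
    total = 0j
    term = 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # next term: z^(n+1) / (n+1)!
    return total

z = 0.7 + 0.3j
lhs = exp_series(1j * z)                   # e^{iz} summed from the series
rhs = cmath.cos(z) + 1j * cmath.sin(z)     # cos z + i sin z
print(lhs, rhs)

# For real theta the point e^{i theta} lies on the unit circle.
print(abs(exp_series(2j)))
```

The two printed complex numbers agree to within rounding error, and the last value is 1 up to the same accuracy.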


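8.64. THEOREM. The formula

\[ e^{z_1 + z_2} = e^{z_1} e^{z_2} \tag{12} \]

holds for arbitrary complex $z_1$ and $z_2$.

Proof. Both series

\[ e^{z_1} = 1 + z_1 + \frac{z_1^2}{2!} + \cdots, \quad e^{z_2} = 1 + z_2 + \frac{z_2^2}{2!} + \cdots \]

are absolutely convergent (why?), and hence, by the theorem on multiplication of complex series (Theorem 6.46),

\[ e^{z_1} e^{z_2} = 1 + (z_1 + z_2) + \left( \frac{z_1^2}{2!} + z_1 z_2 + \frac{z_2^2}{2!} \right) + \left( \frac{z_1^3}{3!} + \frac{z_1^2 z_2}{2!} + \frac{z_1 z_2^2}{2!} + \frac{z_2^3}{3!} \right) + \cdots \]
\[ = 1 + (z_1 + z_2) + \frac{(z_1 + z_2)^2}{2!} + \frac{(z_1 + z_2)^3}{3!} + \cdots = e^{z_1 + z_2}. \; ∎ \]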
8.65. In particular, if z = x + iy (x and y real), then, by (12) and (10), we have

\[ e^z = e^{x + iy} = e^x e^{iy} = e^x (\cos y + i \sin y) = e^x \cos y + i e^x \sin y, \tag{13} \]

thereby explicitly exhibiting the real and imaginary parts of $e^z$. Formula (13) reveals an interesting property of $e^z$, namely its periodicity in the complex plane. As in the real case (see Sec. 5.65c), a function f(z) such that f(z + T) = f(z) for all complex z and some T is said to be periodic, with period T (in general, T is itself complex). According to (13),

\[ e^{z + 2\pi i} = e^{x + iy + 2\pi i} = e^{x + i(y + 2\pi)} = e^x [\cos(y + 2\pi) + i \sin(y + 2\pi)] = e^x (\cos y + i \sin y) = e^z, \tag{14} \]

so that $e^z$ is periodic, with the purely imaginary period $T = 2\pi i$. More generally,

\[ e^{z + 2k\pi i} = e^z \]

for all $k = 0, \pm 1, \pm 2, \ldots$ (why?).

On the other hand, the functions $e^{iz}$ and $e^{-iz}$ are periodic, with the real period $T = 2\pi$. For example, changing z to iz in (14), we get

\[ e^{iz + 2\pi i} = e^{i(z + 2\pi)} = e^{iz}, \]

and similarly for $e^{-iz}$. It then follows from (8) and (9) that sin z and cos z are also periodic, with period $2\pi$.†

Setting $z = i\pi$ in (13), we get the interesting formula

\[ e^{i\pi} = -1. \]
8.66. Next, using Theorem 8.64 to multiply formulas (6) and (7), we find that

\[ e^{iz} e^{-iz} = e^0 = 1 = (\cos z + i \sin z)(\cos z - i \sin z) = \cos^2 z + \sin^2 z, \tag{15} \]

which reduces to formula (8), p. 261 for real z = x. Unlike the real case, however, we cannot infer from (15) that sin z and cos z do not exceed 1 in absolute value, since sin z and cos z are now complex. In fact, setting z = iy (y real) in (8) and (9), we get

\[ \cos iy = \frac{e^{-y} + e^{y}}{2}, \quad \sin iy = \frac{e^{-y} - e^{y}}{2i}, \]

so that cos iy and sin iy become arbitrarily large in absolute value as $y \to \pm\infty$.


8.67. Suppose we replace $z$ by $z_1$ in (6) and $z$ by $z_2$ in (7), where $z_1$ and $z_2$ are arbitrary complex numbers, and then use Theorem 8.64 to multiply the resulting formulas. This gives

$$\cos(z_1+z_2) + i\sin(z_1+z_2) = e^{i(z_1+z_2)} = e^{iz_1}e^{iz_2}$$
$$= (\cos z_1 + i\sin z_1)(\cos z_2 + i\sin z_2)$$
$$= (\cos z_1\cos z_2 - \sin z_1\sin z_2) + i(\cos z_1\sin z_2 + \sin z_1\cos z_2),$$

and similarly,

$$\cos(z_1+z_2) - i\sin(z_1+z_2) = e^{-i(z_1+z_2)} = e^{-iz_1}e^{-iz_2}$$
$$= (\cos z_1 - i\sin z_1)(\cos z_2 - i\sin z_2)$$
$$= (\cos z_1\cos z_2 - \sin z_1\sin z_2) - i(\cos z_1\sin z_2 + \sin z_1\cos z_2).$$

First subtracting and then adding these formulas, we find that

$$\sin(z_1+z_2) = \sin z_1\cos z_2 + \cos z_1\sin z_2,$$

$$\cos(z_1+z_2) = \cos z_1\cos z_2 - \sin z_1\sin z_2,$$


which reduce to formulas (9) and (10), p. 261 for real $z_1 = x$, $z_2 = y$.


8.68. In keeping with the discussion of Sec. 8.54, to complete the theory of the trigonometric functions, we need only show that the functions $\sin x$ and $\cos x$, regarded as defined by the series (2) and (3), satisfy the inequalities

$$0 < \sin x < x < \frac{\sin x}{\cos x} \tag{16}$$

for all sufficiently small $x > 0$. By formula (4'), p. 258,

$$\sin x = x - \frac{x^3}{6}E = x\left(1 - \frac{x^2}{6}E\right) \quad (E \to 1 \text{ as } x \to 0). \tag{17}$$

But, for all sufficiently small $x > 0$,

$$0 < 1 - \frac{x^2}{6}E < 1,$$

and hence $0 < \sin x < x$, because of (17). Moreover, by formulas (3) and (6), p. 258,

$$\cos x = 1 - \frac{x^2}{2}E \quad (E \to 1 \text{ as } x \to 0),$$

$$(1+x)^{-1} = 1 - xE \quad (E \to 1 \text{ as } x \to 0),$$

which imply

$$\frac{\sin x}{\cos x} = x\left(1 - \frac{x^2}{6}E\right)\left(1 - \frac{x^2}{2}E\right)^{-1} = x\left(1 - \frac{x^2}{6}E\right)\left(1 + \frac{x^2}{2}E\right) = x\left(1 + \frac{x^2}{3}E\right). \tag{18}$$

This time, for all sufficiently small $x > 0$,

$$1 + \frac{x^2}{3}E > 1,$$

and hence

$$\frac{\sin x}{\cos x} > x,$$

because of (18).
This completes the proof of (16), thereby accomplishing the program of Sec. 
8.54. 
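The inequalities (16) are easy to test numerically for a few small positive values of $x$. The following sketch uses Python's standard `math` module; the function name `satisfies_16` and the chosen sample points are assumptions for illustration only.

```python
import math

def satisfies_16(x: float) -> bool:
    """Check 0 < sin x < x < sin x / cos x for a given x."""
    return 0 < math.sin(x) < x < math.sin(x) / math.cos(x)

# Sample points 0.1, 0.01, ..., 0.00001 (an arbitrary choice).
samples = [10.0 ** (-k) for k in range(1, 6)]
assert all(satisfies_16(x) for x in samples)

# The inequalities are only claimed for sufficiently small x > 0;
# they fail, for example, at x = 2, where cos x is negative.
assert not satisfies_16(2.0)
```

A finite set of samples does not prove (16), of course; the check merely illustrates the statement established above.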


8.7. Hyperbolic Functions 


8.71. Definition. The functions defined by the series

$$\cosh z = 1 + \frac{z^2}{2!} + \frac{z^4}{4!} + \cdots, \tag{1}$$

$$\sinh z = z + \frac{z^3}{3!} + \frac{z^5}{5!} + \cdots \tag{2}$$

(which converge for all complex $z$), are called the hyperbolic cosine and the hyperbolic sine, respectively.


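8.72. Replacing $z$ by $-z$ in the series

$$e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \tag{3}$$

(Sec. 8.61), we get

$$e^{-z} = 1 - z + \frac{z^2}{2!} - \frac{z^3}{3!} + \cdots. \tag{4}$$

Comparing (1)-(4), we find that

$$e^z = \cosh z + \sinh z,$$
$$e^{-z} = \cosh z - \sinh z.$$

Solving these equations for $\cosh z$ and $\sinh z$ then gives

$$\cosh z = \frac{e^z + e^{-z}}{2}, \tag{5}$$

$$\sinh z = \frac{e^z - e^{-z}}{2}. \tag{6}$$

In particular, we have

$$\cosh x = \frac{e^x + e^{-x}}{2}, \tag{5'}$$

$$\sinh x = \frac{e^x - e^{-x}}{2} \tag{6'}$$

for real $z = x$. The functions $\cosh x$ and $\sinh x$ have the graphs shown in Figure 20.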


8.73. The hyperbolic functions are intimately related to the trigonometric 
functions in the complex domain. In fact, replacing z by iz in (5) and (6), we get 


$$\cos z = \frac{e^{iz} + e^{-iz}}{2} = \cosh iz,$$

$$i\sin z = \frac{e^{iz} - e^{-iz}}{2} = \sinh iz.$$

Then, replacing $iz$ by $z$, we find that

$$\cosh z = \cos(-iz) = \cos iz,$$
$$\sinh z = i\sin(-iz) = -i\sin iz.$$


Thus, to within constant factors, the trigonometric functions are obtained from 
the hyperbolic functions and the hyperbolic functions from the trigonometric 
functions by multiplying the argument z by i. This corresponds to rotating z 
through a right angle in the complex plane (see Sec. 5.72). 


Figure 20


8.74. The following formulas are all easy consequences of the results of Secs. 8.66, 8.67, and 8.73 (for arbitrary real $x$, $y$ and complex $z$, $z_1$, $z_2$):

$$\sin(x+iy) = \sin x\cos iy + \cos x\sin iy = \sin x\cosh y + i\cos x\sinh y, \tag{7}$$

$$\cos(x+iy) = \cos x\cos iy - \sin x\sin iy = \cos x\cosh y - i\sin x\sinh y,$$

$$\cosh^2 z - \sinh^2 z = \cos^2(iz) + \sin^2(iz) = 1,$$

$$\cosh(z_1+z_2) = \cos i(z_1+z_2) = \cos iz_1\cos iz_2 - \sin iz_1\sin iz_2 = \cosh z_1\cosh z_2 + \sinh z_1\sinh z_2,$$

$$\sinh(z_1+z_2) = -i\sin i(z_1+z_2) = -i(\sin iz_1\cos iz_2 + \cos iz_1\sin iz_2) = \sinh z_1\cosh z_2 + \cosh z_1\sinh z_2.$$

In particular, we have

$$\cosh 2z = \cosh^2 z + \sinh^2 z,$$
$$\sinh 2z = 2\sinh z\cosh z$$

and

$$\sinh(x+iy) = \sinh x\cosh iy + \cosh x\sinh iy = \sinh x\cos y + i\cosh x\sin y,$$

$$\cosh(x+iy) = \cosh x\cosh iy + \sinh x\sinh iy = \cosh x\cos y + i\sinh x\sin y.$$

To differentiate $\cosh x$ and $\sinh x$ (for real $x$), we need only note that, by (5') and (6'),

$$(\cosh x)' = \left(\frac{e^x + e^{-x}}{2}\right)' = \frac{e^x - e^{-x}}{2} = \sinh x, \qquad (\sinh x)' = \left(\frac{e^x - e^{-x}}{2}\right)' = \frac{e^x + e^{-x}}{2} = \cosh x.$$


8.75. Thus we see that the theory of the hyperbolic functions is in certain respects simpler than that of the trigonometric functions. The adjective “hyperbolic” is explained by the fact that the curve in the $xy$-plane with the parametric representation

$$x = \cosh t, \quad y = \sinh t$$

is the hyperbola $x^2 - y^2 = 1$. The corresponding curve for the trigonometric functions has the parametric representation

$$x = \cos t, \quad y = \sin t$$

and is just the circle $x^2 + y^2 = 1$. Hence the trigonometric functions are sometimes called “circular” functions.
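The relations of Secs. 8.73-8.75 lend themselves to a quick numerical check. The sketch below defines the hyperbolic functions through formulas (5) and (6) and uses Python's standard `cmath` and `math` modules; the names `cosh_def` and `sinh_def` and the sample points are assumptions chosen for illustration.

```python
import cmath
import math

def cosh_def(z: complex) -> complex:
    """Hyperbolic cosine via formula (5)."""
    return (cmath.exp(z) + cmath.exp(-z)) / 2

def sinh_def(z: complex) -> complex:
    """Hyperbolic sine via formula (6)."""
    return (cmath.exp(z) - cmath.exp(-z)) / 2

z1, z2 = complex(0.4, 1.1), complex(-0.7, 0.25)  # arbitrary sample points

# Sec. 8.73: cosh z = cos iz and sinh z = -i sin iz.
assert cmath.isclose(cosh_def(z1), cmath.cos(1j * z1))
assert cmath.isclose(sinh_def(z1), -1j * cmath.sin(1j * z1))

# Sec. 8.74: cosh^2 z - sinh^2 z = 1 and the addition formula for cosh.
assert cmath.isclose(cosh_def(z1) ** 2 - sinh_def(z1) ** 2, 1.0)
assert cmath.isclose(
    cosh_def(z1 + z2),
    cosh_def(z1) * cosh_def(z2) + sinh_def(z1) * sinh_def(z2),
)

# Sec. 8.75: the point (cosh t, sinh t) lies on the hyperbola x^2 - y^2 = 1.
for t in (-2.0, 0.0, 0.5, 3.0):
    x, y = math.cosh(t), math.sinh(t)
    assert math.isclose(x * x - y * y, 1.0, rel_tol=1e-9)
```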


Problems 

1. Prove that the function $e^{-1/x^2}$ cannot be represented as a Taylor series in powers of $x$, although it has derivatives of all orders at $x = 0$.

2. Prove that the function

$$y = x^{2n}\sin\frac{1}{x}$$

has derivatives up to order $n$ inclusive at $x = 0$, while the $(n+1)$st derivative $y^{(n+1)}(0)$ fails to exist.

3. Prove that if all the roots of a polynomial $P(x)$ of degree $n$ are real, then so are all the roots of the polynomials $P'(x), P''(x), \ldots, P^{(n-1)}(x)$.

4. Suppose $f(x)$ is bounded and has continuous bounded derivatives $f'(x)$ and $f''(x)$ on $(-\infty, \infty)$. Prove that

$$M_1^2 \le 2M_0M_2,$$

where

$$M_k = \sup_{-\infty<x<\infty} |f^{(k)}(x)| \quad (k = 0, 1, 2).$$

5. Prove that the second derivative $f''(x)$, if it exists, can be found by taking the following limit of a quotient involving the values of $f(x)$ at three neighboring points:

$$f''(x) = \lim_{h\to 0}\frac{f(x) - 2f(x+h) + f(x+2h)}{h^2}.$$

Write an analogous formula for f(x). 

6. Prove that if $f(x)$ is convex in the sense of Chapter 2, Problem 7 and if $f(x)$ has a second derivative at $x = x_0$, then $f''(x_0) \ge 0$.

7. Prove that if $f''(x) > 0$ for all $x \in [a, b]$, then $f(x)$ is convex on $[a, b]$.

8. Prove that the equation $\sin z = 0$ has no roots in the complex plane other than the real roots $z = 0, \pm\pi, \pm 2\pi, \ldots$

9. Prove that there is a constant $C > 0$ such that $|\sin z| \ge C$ on the set of all circles $|z| = (n + \frac{1}{2})\pi$ $(n = 0, 1, 2, \ldots)$.

10. Estimate the error @ made in replacing the function e* on the interval [0, 1] 

by its Taylor polynomial of degree 10. On what interval [0, 4] does the function 

e* differ from its Taylor polynomial of degree 10 by no more than 10°’? For 

what value of n does the function e* differ from its Taylor polynomial of degree 

n by no more than 10’ on the interval [0, 1]? 

11. Suppose $f(x)$ has derivatives up to order $n$ in some interval containing the point $x_0$ and a derivative of order $n + 1$ at the point $x_0$ itself, where

$$f''(x_0) = \cdots = f^{(n)}(x_0) = 0, \quad f^{(n+1)}(x_0) \ne 0.$$

Describe the position of the curve $y = f(x)$ relative to its tangent at the point $x_0$, assuming that (a) $f^{(n+1)}(x_0) > 0$, $n$ even; (b) $f^{(n+1)}(x_0) > 0$, $n$ odd; (c) $f^{(n+1)}(x_0) < 0$, $n$ even; (d) $f^{(n+1)}(x_0) < 0$, $n$ odd.

12. If $f(x)$ can be written in the form

$$f(a + h) = f(a) + Ah + o(h)$$

in a neighborhood of the point $a$, then, by Theorem 7.21, $f'(a)$ exists and equals $A$. Suppose $f(x)$ can be written in the form

$$f(a + h) = f(a) + f'(a)h + \tfrac{1}{2}Bh^2 + o(h^2)$$

in a neighborhood of the point $a$. Does $f''(a)$ necessarily exist?

13. Prove the inequality 


is an integer to prove that e is irrational. 


15. Show that the infinite product 


I (1 + W,) 


(1) 


diverges if 


] 


O,.=(— TERT 


although 
> W, (2) 
k=1 


converges. Show that (1) converges if 


wo, =~ W/E}, 


although (2) then diverges, 
16. Prove that 


2 x* xin 2 
| ae (Qn—2)1- 
x? x* x 
ae 5 ol Bi: 3 
esa Tae (3) 
for all x # 0. 


17. Deduce from (3) that 2.82 << 3.19. 
18. Suppose $y = f(x)$ has derivatives up to order $n$ inclusive at every point of an interval $(a, b)$, and define the differential of order $n$ of $f(x)$ as the function

$$d^n y = f^{(n)}(x)\,(dx)^n. \tag{4}$$

Prove that $d^n y = d(d^{n-1}y)$ if $dx$ is regarded as constant. Calculate $d^n y$ for the function $y = x$. Find the second differential of a composite function $z = g(y) = g(f(x))$, and show that, unlike the first differential, the result does not agree with the expression obtained for the case where $y$ is the independent variable, unless $y$ is a linear function $Ax + B$, in which case $d^n z = g^{(n)}(y)\,(dy)^n$.
Write Taylor’s formula in terms of higher differentials.


† By definition $0! = 1$, $n! = 1 \cdot 2 \cdots n$.
+ The limiting values fa) are actually the corresponding right-hand derivatives (see Chapter 7, Problem 
15), but this fact plays no role here. 
+ Alternatively, use Theorem 7.54 and the argument given in the proof of Theorem 8.31. 
+ Formula (6) has already been found in Theorem 5.58e. 
+ Such a function is said to be infinitely differentiable on (a, b). 
† We often call (3) the Taylor (series) expansion of $f(x)$ at $x = x_0$.
+ More exactly, the numerical sequence 
oe = 
T!B’ J1pe""*? nip” 


is bounded. 


+ The periodicity of sin x and cos x for real x (see Sec. 5.65c) has already been used in (14). 


‡ “This remarkable formula symbolizes, as it were, the unity of all mathematics, with $-1$ representing arithmetic, $i$ algebra, $\pi$ geometry, and $e$ analysis.” (A. N. Krylov)
† Thus the stipulation that all the $\omega_k$ have the same sign cannot be dropped in Chapter 6, Problem 12.
† We can use (4) to interpret the common notation $f^{(n)}(x) = \dfrac{d^n y}{dx^n}$.


9 The Integral 


The preceding two chapters have been devoted to one of the basic concepts of 
mathematical analysis, namely the derivative. We now turn to the equally 
important concept of the integral. 


9.1. Definitions and Basic Properties 


9.11. Partitions with marked points. A set of points $x_0, x_1, \ldots, x_n$ of a closed interval $[a, b]$ such that

$$a = x_0 \le x_1 \le \cdots \le x_n = b$$

is called a partition of $[a, b]$ into subintervals $[x_k, x_{k+1}]$, and the points $x_0, x_1, \ldots, x_n$ themselves are called points of subdivision of $[a, b]$. If, in addition, a point $\xi_k$ is chosen in each subinterval $[x_k, x_{k+1}]$, so that

$$x_k \le \xi_k \le x_{k+1} \quad (k = 0, 1, \ldots, n-1),$$

our partition is said to be a partition with marked points. We denote such a partition by $\Pi$, or in more detail,

$$\Pi = \{a = x_0 \le \xi_0 \le x_1 \le \xi_1 \le x_2 \le \cdots \le x_{n-1} \le \xi_{n-1} \le x_n = b\}.$$

The difference $x_{k+1} - x_k$ is denoted by $\Delta x_k$, and the number

$$d(\Pi) = \max\{\Delta x_0, \Delta x_1, \ldots, \Delta x_{n-1}\}$$

is called the parameter of the partition $\Pi$.


9.12. Riemann sums. Let $y = f(x)$ be a (finite) function defined on $[a, b]$. Then, starting from a given partition $\Pi$ with marked points, we can form the Riemann sum

$$S_\Pi(f) = \sum_{k=0}^{n-1} f(\xi_k)\,\Delta x_k. \tag{1}$$
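The definitions of Secs. 9.11 and 9.12 translate directly into a short program. The following is a minimal sketch in Python; the function names `riemann_sum` and `parameter`, the choice $f(x) = x^2$ on $[0, 1]$, and the use of midpoints as marked points are all assumptions made for illustration. The comparison value $1/3$ is the standard value of the integral of $x^2$ over $[0, 1]$, quoted here without proof.

```python
from typing import Callable, Sequence

def riemann_sum(f: Callable[[float], float],
                points: Sequence[float],
                marks: Sequence[float]) -> float:
    """Riemann sum (1): sum of f(xi_k) * (x_{k+1} - x_k) over all subintervals."""
    if len(marks) != len(points) - 1:
        raise ValueError("need exactly one marked point per subinterval")
    total = 0.0
    for k, xi in enumerate(marks):
        if not points[k] <= xi <= points[k + 1]:
            raise ValueError("marked point lies outside its subinterval")
        total += f(xi) * (points[k + 1] - points[k])
    return total

def parameter(points: Sequence[float]) -> float:
    """Parameter d(Pi): the largest subinterval length."""
    return max(right - left for left, right in zip(points, points[1:]))

n = 1000
points = [k / n for k in range(n + 1)]                         # uniform partition of [0, 1]
marks = [(points[k] + points[k + 1]) / 2 for k in range(n)]    # midpoints as marked points
approx = riemann_sum(lambda x: x * x, points, marks)
assert abs(approx - 1 / 3) < 1e-6
assert abs(parameter(points) - 1 / n) < 1e-12
```

Any other admissible choice of marked points gives a different Riemann sum for the same partition; the integral defined in the next section is what all such sums approach as the parameter shrinks.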


9.13. The integral. The finite number $I$ is called the (Riemann) integral of the function $f(x)$ over the interval $[a, b]$ if, given any $\varepsilon > 0$, there exists $\delta > 0$ such that

$$|I - S_\Pi(f)| \le \varepsilon$$

for every partition $\Pi$ with $d(\Pi) < \delta$. This definition can be examined in the light of the general definition of a limit. To this end, let $E$ be the set of all partitions $\Pi$ (with marked points) of the underlying interval $[a, b]$, and, given any $\delta > 0$, let $E_\delta$ be the subset of $E$ consisting of all partitions of $E$ such that $d(\Pi) < \delta$. Then, obviously, given any two sets $E_{\delta_1}$ and $E_{\delta_2}$, either $E_{\delta_1} \subset E_{\delta_2}$ or $E_{\delta_2} \subset E_{\delta_1}$, while the intersection of all the sets $E_\delta$ is empty. Hence the sets $E_\delta$ make up a direction $T$ on the set $E$ (see Sec. 4.11), which can then be used to construct limits of functions defined on $E$. In fact, the general definition of a limit (Sec. 4.12) takes the following form when applied to the present situation: The number $I$ is said to be the limit in the direction $T$ of a function $S(\Pi)$ defined on the set $E$ of all partitions $\Pi$, and we write

$$I = \lim_{T} S(\Pi), \tag{2}$$

if, given any $\varepsilon > 0$, there exists a set $E_\delta \in T$ such that

$$|I - S(\Pi)| < \varepsilon \tag{3}$$

for all $\Pi \in E_\delta$, or equivalently if there exists a $\delta > 0$ such that (3) holds for all $\Pi$ such that $d(\Pi) < \delta$. The limit (2) will also be called the limit of $S(\Pi)$ under arbitrary refinement of the partition.

In particular, let $S(\Pi)$ be the function $S_\Pi(f)$ defined by (1). Then the existence of (2) is equivalent to the existence of the integral of the function $f(x)$ over $[a, b]$. The integral $I$ is denoted by

$$\int_a^b f(x)\,dx, \tag{4}$$

an expression resembling the original Riemann sum (1). However, at the same time that the expression (1) indicates the operations involved in calculating the Riemann sum, the expression (4) is “unsplittable,” at least for the time being, since there is no operation that corresponds to multiplying $f(x)$ by $dx$ and then “integrating” the result. The suitability of this way of writing the integral, rather than just

$$\int_a^b f(x),$$

say, will be apparent later.† On the other hand, it is clear that changing the “variable of integration” has no effect on the value of the integral, so that, for example,

$$\int_a^b f(x)\,dx = \int_a^b f(\xi)\,d\xi = \int_a^b f(t)\,dt.$$

The number $a$ is called the lower limit of integration, while $b$ is called the upper limit of integration. An integral with limits of integration $a$ and $b$ is often simply called an integral “from $a$ to $b$.”

The integral of a given function f(x) may well fail to exist. However, as will 
be shown in Secs. 9.14 and 9.16, continuous functions always have integrals, 
and so do certain simple discontinuous functions. On the other hand, “strongly 
discontinuous” functions like the Dirichlet function (Sec. 9.17) do not have 
integrals. If the integral of f(x) over [a, b] exists, we say that f(x) is integrable on 
[a, b]. 


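9.14. We now prove that every continuous function is integrable on the interval $[a, b]$.

a. Let $\omega_f(\delta)$ denote the modulus of continuity of the function $f(x)$ on $[a, b]$, i.e., let

$$\omega_f(\delta) = \sup_{|x'-x''| \le \delta} |f(x') - f(x'')|.$$

It will be recalled from Sec. 5.17c that

$$\lim_{\delta \to 0} \omega_f(\delta) = 0$$

for every continuous function on $[a, b]$.†

b. A partition

$$\Pi' = \{a = x'_0 \le \xi'_0 \le x'_1 \le \cdots \le x'_{p-1} \le \xi'_{p-1} \le x'_p = b\}$$

is called finer than a partition

$$\Pi = \{a = x_0 \le \xi_0 \le x_1 \le \cdots \le x_{n-1} \le \xi_{n-1} \le x_n = b\}$$

if every point $x_0, \ldots, x_n$ is one of the points $x'_0, \ldots, x'_p$ (so that the set $x'_0, \ldots, x'_p$ is obtained by adding extra points of subdivision to the set $x_0, \ldots, x_n$), with no new requirements whatsoever being imposed on the marked points $\xi_k$, $\xi'_j$.

c. LEMMA. If the partition $\Pi$ is such that $d(\Pi) < \delta$, then

$$|S_\Pi(f) - S_{\Pi'}(f)| \le \omega_f(\delta)(b - a) \tag{5}$$

for every finer partition $\Pi'$.

Proof. In going from the partition $\Pi$ to the partition $\Pi'$, each subinterval $[x_k, x_{k+1}]$ acquires new points of subdivision and new marked points which, with a slight change of notation, can be written as follows:

$$x_k = x_{k,0} \le \xi_{k,0} \le x_{k,1} \le \cdots \le x_{k,m_k-1} \le \xi_{k,m_k-1} \le x_{k,m_k} = x_{k+1}.$$

The part of the Riemann sum $S_{\Pi'}$ coming from $[x_k, x_{k+1}]$ is of the form

$$\sum_{j=0}^{m_k-1} f(\xi_{k,j})\,\Delta x_{k,j} \quad (\Delta x_{k,j} = x_{k,j+1} - x_{k,j}).$$

The term $f(\xi_k)\Delta x_k$ in the Riemann sum $S_\Pi$ can also be written as a sum

$$\sum_{j=0}^{m_k-1} f(\xi_k)\,\Delta x_{k,j}$$

of the same form. Clearly,

$$\sum_{j=0}^{m_k-1} f(\xi_{k,j})\,\Delta x_{k,j} - \sum_{j=0}^{m_k-1} f(\xi_k)\,\Delta x_{k,j} = \sum_{j=0}^{m_k-1} [f(\xi_{k,j}) - f(\xi_k)]\,\Delta x_{k,j},$$

and hence, since $d(\Pi) < \delta$,

$$\left|\sum_{j=0}^{m_k-1} f(\xi_{k,j})\,\Delta x_{k,j} - f(\xi_k)\,\Delta x_k\right| \le \sum_{j=0}^{m_k-1} |f(\xi_{k,j}) - f(\xi_k)|\,\Delta x_{k,j} \le \omega_f(\delta)\,\Delta x_k.$$

Summing over $k$, we finally get

$$|S_{\Pi'}(f) - S_\Pi(f)| = \left|\sum_{k=0}^{n-1}\sum_{j=0}^{m_k-1} f(\xi_{k,j})\,\Delta x_{k,j} - \sum_{k=0}^{n-1} f(\xi_k)\,\Delta x_k\right| \le \omega_f(\delta)(b - a). \; ∎$$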


d. LEMMA. If $\Pi_1$ and $\Pi_2$ are any two partitions such that $d(\Pi_1) < \delta$, $d(\Pi_2) < \delta$, then

$$|S_{\Pi_1}(f) - S_{\Pi_2}(f)| \le 2\omega_f(\delta)(b - a). \tag{6}$$

Proof. Let $\Pi_3$ be a new partition using all the points of subdivision of $\Pi_1$ and $\Pi_2$ as its points of subdivision, with the marked points being chosen arbitrarily. The partition $\Pi_3$ is clearly finer than both $\Pi_1$ and $\Pi_2$. Hence, by the preceding lemma,

$$|S_{\Pi_1}(f) - S_{\Pi_3}(f)| \le \omega_f(\delta)(b - a),$$

$$|S_{\Pi_2}(f) - S_{\Pi_3}(f)| \le \omega_f(\delta)(b - a),$$

which together imply (6). ∎

e. We are now in a position to prove the following key

THEOREM. If $f(x)$ is continuous on $[a, b]$, then $f(x)$ is integrable on $[a, b]$.

Proof. By Cauchy’s convergence criterion for the existence of a limit in a direction (Theorem 4.19),‡ we need only prove that, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$|S_{\Pi_1}(f) - S_{\Pi_2}(f)| < \varepsilon$$

whenever $d(\Pi_1) < \delta$, $d(\Pi_2) < \delta$. Since $f(x)$ is continuous (and hence uniformly continuous) on $[a, b]$, given any $\varepsilon > 0$, we can find a $\delta > 0$ such that

$$\omega_f(\delta) < \frac{\varepsilon}{2(b - a)}.$$

But then, by the preceding lemma,

$$|S_{\Pi_1}(f) - S_{\Pi_2}(f)| < 2\,\frac{\varepsilon}{2(b - a)}\,(b - a) = \varepsilon$$

for any two partitions $\Pi_1$ and $\Pi_2$ such that $d(\Pi_1) < \delta$, $d(\Pi_2) < \delta$. ∎


9.15. Before considering the integrals of a wider class of functions, we now 
prove a number of general properties of integrals, assuming that the integrals in 
question exist, without using any special properties of the functions involved. 


a. THEOREM. If the function $f(x)$ is integrable on $[a, b]$ and if $k$ is constant, then $kf(x)$ is also integrable on $[a, b]$ and

$$\int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx, \tag{7}$$

i.e., a constant can always be brought out in front of the integral sign.
Proof. In fact,

$$S_\Pi(kf) = \sum_{k'=0}^{n-1} kf(\xi_{k'})\,\Delta x_{k'} = k\sum_{k'=0}^{n-1} f(\xi_{k'})\,\Delta x_{k'} = kS_\Pi(f),$$

where the right-hand side clearly has a limit under arbitrary refinement of the partition $\Pi$, equal to

$$k\int_a^b f(x)\,dx.$$

But then the left-hand side has the same limit under arbitrary refinement of the partition, thereby proving (7).


b. THEOREM. If the functions $f(x)$ and $g(x)$ are integrable on $[a, b]$, then so is the function $f(x) + g(x)$ and

$$\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx, \tag{8}$$

i.e., the integral of a sum equals the sum of the integrals of the separate terms.
Proof. For any partition $\Pi$ we have

$$S_\Pi(f + g) = \sum_{k=0}^{n-1} [f(\xi_k) + g(\xi_k)]\,\Delta x_k = \sum_{k=0}^{n-1} f(\xi_k)\,\Delta x_k + \sum_{k=0}^{n-1} g(\xi_k)\,\Delta x_k = S_\Pi(f) + S_\Pi(g).$$

Taking the limit of both sides under arbitrary refinement of the partition, we deduce the integrability of $f(x) + g(x)$ and the validity of (8). ∎


c. THEOREM. Every integrable function f(x) is bounded. 


Proof. Suppose $f(x)$ is integrable but unbounded on the underlying interval $[a, b]$, and consider any partition $\Pi$ of $[a, b]$. Then $f(x)$ is unbounded on at least one of the subintervals $[x_k, x_{k+1}]$, and hence, by fixing the marked points in the other subintervals and varying $\xi_k$ in $[x_k, x_{k+1}]$, we can make $S_\Pi(f)$ arbitrarily large in absolute value. But then $S_\Pi(f)$ cannot have a finite limit under arbitrary refinement of the partition, i.e., $f(x)$ fails to be integrable. This contradiction shows that $f(x)$ must be bounded. ∎


d. THEOREM. If $f(x)$ and $g(x)$ are integrable on $[a, b]$ and satisfy the inequality $f(x) \le g(x)$, then

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \tag{9}$$

Proof. The Riemann sums $S_\Pi(f)$ and $S_\Pi(g)$ for the same partition $\Pi$ with the same marked points $\xi_k$ satisfy the inequality

$$S_\Pi(f) = \sum_{k=0}^{n-1} f(\xi_k)\,\Delta x_k \le \sum_{k=0}^{n-1} g(\xi_k)\,\Delta x_k = S_\Pi(g),$$

which gives (9) after taking the limit as $d(\Pi) \to 0$. ∎


e. In particular, if $f(x)$ is integrable on $[a, b]$ and $f(x) \le M$, then

$$\int_a^b f(x)\,dx \le \int_a^b M\,dx = M(b - a), \tag{10}$$

while if $f(x) \ge m$, then

$$\int_a^b f(x)\,dx \ge \int_a^b m\,dx = m(b - a). \tag{11}$$

In particular, if $f(x)$ is nonnegative, we can choose $m = 0$ in (11), so that the integral of $f(x)$ is also nonnegative, while if $f(x)$ is nonpositive, we can choose $M = 0$ in (10), so that the integral of $f(x)$ is also nonpositive.


f. Let $f(x)$ be integrable on $[a, b]$ and let

$$m \le f(x) \le M \quad (a \le x \le b).$$

Then, by (10) and (11),

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a). \tag{12}$$

The quantity

$$\frac{1}{b - a}\int_a^b f(x)\,dx \tag{13}$$

is called the (integral) mean value of $f(x)$ on the interval $[a, b]$. It follows from (12) that

$$m \le \frac{1}{b - a}\int_a^b f(x)\,dx \le M. \tag{14}$$

THEOREM (Mean value theorem for integrals). If $f(x)$ is continuous on $[a, b]$, then there exists a point $\xi \in [a, b]$ such that

$$\frac{1}{b - a}\int_a^b f(x)\,dx = f(\xi).$$

Proof. Let

$$m = \inf_{a \le x \le b} f(x), \quad M = \sup_{a \le x \le b} f(x)$$

in (14). By Weierstrass’ theorem (Theorem 5.16b), there are points $p, q \in [a, b]$ such that $f(p) = m$, $f(q) = M$. If $p < q$, then by (14) and the intermediate value theorem (Theorem 5.23), $f(x)$ takes the value (13) at some point $\xi \in [p, q] \subset [a, b]$, and similarly if $p > q$. ∎


g. THEOREM. If $f(x)$ and $|f(x)|$ are integrable on $[a, b]$,† then

$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx. \tag{15}$$

Proof. It follows from

$$-|f(x)| \le f(x) \le |f(x)|$$

and Theorem 9.15d that

$$-\int_a^b |f(x)|\,dx \le \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx,$$

which is equivalent to (15). ∎


h. THEOREM. If $f(x)$ is integrable on the intervals $[a, c]$ and $[c, b]$, where $a < c < b$, then $f(x)$ is integrable on the interval $[a, b]$ and

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \tag{16}$$

Proof. Let

$$\Pi = \{a = x_0 \le \xi_0 \le x_1 \le \cdots \le x_m \le \xi_m \le x_{m+1} \le \cdots \le \xi_{n-1} \le x_n = b\}$$

be any partition of the interval $[a, b]$, where $m$ is the index for which $x_m \le c \le x_{m+1}$. The corresponding Riemann sum with marked points $\xi_k$ $(k = 0, 1, \ldots, n - 1)$ can be written in the form

$$S_\Pi(f) = \sum_{k=0}^{n-1} f(\xi_k)\,\Delta x_k = \sum_{k=0}^{m-1} f(\xi_k)\,\Delta x_k + f(\xi_m)\,\Delta x_m + \sum_{k=m+1}^{n-1} f(\xi_k)\,\Delta x_k$$

$$= \sum_{k=0}^{m-1} f(\xi_k)\,\Delta x_k + f(c)(c - x_m) + f(c)(x_{m+1} - c) + \sum_{k=m+1}^{n-1} f(\xi_k)\,\Delta x_k + [f(\xi_m) - f(c)]\,\Delta x_m$$

$$= S_{\Pi_1}(f) + S_{\Pi_2}(f) + [f(\xi_m) - f(c)]\,\Delta x_m, \tag{17}$$

where $\Pi_1$ and $\Pi_2$ are the corresponding partitions of the intervals $[a, c]$ and $[c, b]$. Under an arbitrary refinement of $\Pi$, the left-hand side of (17) approaches

$$\int_a^b f(x)\,dx,$$

while the right-hand side approaches

$$\int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$

In fact, an arbitrary refinement of $\Pi$ obviously leads to an arbitrary refinement of both $\Pi_1$ and $\Pi_2$, while at the same time causing the last term in the right-hand side of (17) to approach zero, since $f(x)$ is bounded (being integrable) and $\Delta x_m \to 0$. ∎


i. Next we prove the converse assertion: 


THEOREM. If $f(x)$ is integrable on the interval $[a, b]$, then $f(x)$ is integrable on the intervals $[a, c]$ and $[c, b]$, where $a < c < b$.

Proof. We need only give the proof for the interval $[a, c]$, since the proof for $[c, b]$ is completely analogous. Given two partitions

$$\Pi = \{a = x_0 \le \xi_0 \le x_1 \le \cdots \le x_n = c\},$$
$$\Pi' = \{a = x'_0 \le \xi'_0 \le x'_1 \le \cdots \le x'_p = c\}$$

of the interval $[a, c]$, suppose we enlarge $\Pi$ and $\Pi'$ in the same way to form partitions $\bar\Pi$ and $\bar\Pi'$ of the interval $[a, b]$, i.e., by adjoining to both the same points of subdivision and the same marked points in $[c, b]$. Then, since $f(x)$ is integrable on $[a, b]$, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$|S_{\bar\Pi}(f) - S_{\bar\Pi'}(f)| < \varepsilon$$

whenever $d(\bar\Pi) < \delta$, $d(\bar\Pi') < \delta$. But obviously

$$S_{\bar\Pi}(f) - S_{\bar\Pi'}(f) = S_\Pi(f) - S_{\Pi'}(f),$$

and hence

$$|S_\Pi(f) - S_{\Pi'}(f)| \le \varepsilon$$

whenever $d(\Pi) < \delta$, $d(\Pi') < \delta$. Therefore the Cauchy convergence criterion for the existence of a limit in the direction $d(\Pi) \to 0$ is satisfied for the Riemann sums of the function $f(x)$ on the interval $[a, c]$. It follows that $f(x)$ is integrable on $[a, c]$. ∎


j. THEOREM. If $f(x)$ is integrable on the interval $[a, b]$ and if $a = c_0 < c_1 < \cdots < c_n = b$, then $f(x)$ is integrable on every interval $[c_0, c_1], \ldots, [c_{n-1}, c_n]$ and

$$\int_a^b f(x)\,dx = \int_{c_0}^{c_1} f(x)\,dx + \cdots + \int_{c_{n-1}}^{c_n} f(x)\,dx.$$

Proof. Apply the preceding two theorems repeatedly. ∎


k. Finally we prove a useful estimate for the deviation of an integral from its 
integral sums: 


THEOREM. If $f(x)$ is integrable on $[a, b]$ and $d(\Pi) < \delta$, then

$$\left|\int_a^b f(x)\,dx - S_\Pi(f)\right| \le \omega_f(\delta)(b - a).$$

Proof. Take the limit as $d(\Pi') \to 0$ of the inequality (5). ∎
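The estimate just proved can be observed on a concrete example. The sketch below takes $f(x) = x^2$ on $[0, 1]$, for which the modulus of continuity works out to $\omega_f(\delta) = 2\delta - \delta^2$ (the supremum is attained at $x' = 1$, $x'' = 1 - \delta$), and uses uniform partitions with left endpoints as marked points. The function names, the choice of $f$, and the comparison value $1/3$ for the integral are assumptions introduced for illustration.

```python
def left_riemann_sum(f, a, b, n):
    """Riemann sum over a uniform partition with left endpoints as marked points."""
    dx = (b - a) / n
    return sum(f(a + k * dx) * dx for k in range(n))

def omega_square(delta):
    """Modulus of continuity of f(x) = x^2 on [0, 1]."""
    delta = min(delta, 1.0)
    return 2 * delta - delta * delta

exact = 1.0 / 3.0  # standard value of the integral of x^2 over [0, 1]
for n in (10, 100, 1000):
    delta = 1.0 / n                      # parameter d(Pi) of the uniform partition
    error = abs(exact - left_riemann_sum(lambda x: x * x, 0.0, 1.0, n))
    bound = omega_square(delta) * (1.0 - 0.0)
    assert error <= bound                # the estimate of Sec. 9.15k
```

In this example the actual error is roughly a quarter of the bound, which shows that the estimate is a guarantee rather than a sharp prediction.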


9.16. Next we show that a number of discontinuous functions are integrable. The 
class of discontinuous functions to be considered here is admittedly rather small, 
but is quite sufficient for our subsequent purposes. 


a. LEMMA. Let $h(x)$ be a function equal to 0 for $a < x < b$ but taking arbitrary values $h(a)$ and $h(b)$. Then $h(x)$ is integrable on $[a, b]$ and

$$\int_a^b h(x)\,dx = 0. \tag{18}$$

Proof. The Riemann sum of $h(x)$ for any partition $\Pi$ reduces to

$$h(\xi_0)\,\Delta x_0 + h(\xi_{n-1})\,\Delta x_{n-1},$$

where $h(\xi_0)$ equals 0 or $h(a)$ and $h(\xi_{n-1})$ equals 0 or $h(b)$. But this expression obviously approaches 0 as $d(\Pi) \to 0$. ∎


b. LEMMA. Given a function $f(x)$ continuous on $[a, b]$, suppose $f_1(x)$ coincides with $f(x)$ on the open interval $(a, b)$, while taking arbitrary values $f_1(a)$ and $f_1(b)$. Then $f_1(x)$ is integrable on $[a, b]$ and

$$\int_a^b f_1(x)\,dx = \int_a^b f(x)\,dx. \tag{19}$$

Proof. The function $h(x) = f_1(x) - f(x)$ satisfies the conditions of the preceding lemma. Hence $h(x)$ is integrable on $[a, b]$ and satisfies (18). But then the function $f_1(x) = f(x) + h(x)$ is also integrable on $[a, b]$ and satisfies (19), because of (18) and Theorem 9.15b. ∎


c. A function $f(x)$ is said to be piecewise continuous on $[a, b]$ if there exists a partition

$$a = x_0 < x_1 < \cdots < x_n = b$$

such that $f(x)$ is continuous on every open interval $x_k < x < x_{k+1}$ and has finite limits

$$f(x_0 + 0),\; f(x_1 - 0),\; f(x_1 + 0),\; f(x_2 - 0),\; f(x_2 + 0),\; \ldots,\; f(x_n - 0)$$

(see Sec. 5.32).†
THEOREM. If $f(x)$ is piecewise continuous on $[a, b]$, then $f(x)$ is integrable on $[a, b]$.

Proof. Let $f_k(x)$ denote the function defined on $x_k \le x \le x_{k+1}$ which coincides with $f(x)$ for $x_k < x < x_{k+1}$ and takes the appropriate limiting values $f(x_k + 0)$ and $f(x_{k+1} - 0)$ at the end points of $[x_k, x_{k+1}]$. Then $f_k(x)$ is continuous on $[x_k, x_{k+1}]$ and hence integrable on $[x_k, x_{k+1}]$, by Theorem 9.14e. Moreover, by the preceding lemma, $f(x)$ is integrable on $[x_k, x_{k+1}]$ and

$$\int_{x_k}^{x_{k+1}} f(x)\,dx = \int_{x_k}^{x_{k+1}} f_k(x)\,dx.$$

Applying Theorem 9.15h repeatedly, we see that $f(x)$ is integrable on the whole interval $[a, b]$. ∎

It follows from Theorem 9.15j that

$$\int_a^b f(x)\,dx = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} f(x)\,dx = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} f_k(x)\,dx,$$

so that calculating the integral of a piecewise continuous function reduces to calculating integrals of continuous functions.
d. In particular, every piecewise constant function

$$h(x) = \begin{cases} h_0 = \text{const} & \text{if } a = x_0 < x < x_1, \\ h_1 = \text{const} & \text{if } x_1 < x < x_2, \\ \;\cdots \\ h_{n-1} = \text{const} & \text{if } x_{n-1} < x < x_n = b \end{cases}$$

is integrable, with integral

$$\int_a^b h(x)\,dx = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} h(x)\,dx = \sum_{k=0}^{n-1} h_k \int_{x_k}^{x_{k+1}} dx = \sum_{k=0}^{n-1} h_k (x_{k+1} - x_k).$$

Incidentally, this shows that the Riemann sum $S_\Pi(f)$ of every function $f(x)$ is the integral of a piecewise constant function $h_\Pi(x)$, namely the function taking the constant value $h_k = f(\xi_k)$ in each interval $x_k < x \le x_{k+1}$.


e. THEOREM. If $g(x)$ is a nonnegative piecewise continuous function and if

$$\int_a^b g(x)\,dx = 0, \tag{20}$$

then $g(x)$ can be nonzero only at a finite number of points.

Proof. Suppose first that $g(x)$ is continuous, and suppose $g(x) \not\equiv 0$ on $[a, b]$. Then there exists a point $c \in [a, b]$ such that $g(c) \ne 0$, and hence, by the continuity of $g(x)$, a constant $C > 0$ and a whole interval $[\alpha, \beta]$, $a \le \alpha < \beta \le b$, such that $g(x) \ge C$ for all $x \in [\alpha, \beta]$. But then

$$\int_a^b g(x)\,dx = \int_a^\alpha g(x)\,dx + \int_\alpha^\beta g(x)\,dx + \int_\beta^b g(x)\,dx \ge \int_\alpha^\beta g(x)\,dx \ge (\beta - \alpha)C > 0,$$

contrary to (20). It follows that $g(x) \equiv 0$.
Now suppose $g(x)$ is a piecewise continuous function, with points of discontinuity $c_0 = a < c_1 < \cdots < c_n = b$. Then, by Sec. 9.16c,

$$\int_a^b g(x)\,dx = \sum_{k=0}^{n-1} \int_{c_k}^{c_{k+1}} g(x)\,dx,$$

and hence

$$\int_{c_k}^{c_{k+1}} g(x)\,dx = 0 \quad (k = 0, 1, \ldots, n - 1),$$

since each term is nonnegative. But $g(x)$ is continuous on $(c_k, c_{k+1})$, and hence $g(x) = 0$ at every point of $(c_k, c_{k+1})$. The only points where $g(x)$ can fail to vanish are the points $c_0, c_1, \ldots, c_n$, of which there are only finitely many. ∎


9.17. We have just proved the integrability of piecewise continuous functions, a 
class of functions with only finitely many points of discontinuity. We now 
inquire about the integrability of functions with infinitely many points of 
discontinuity. In this connection, consider the Dirichlet function 


$$\chi(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases}$$

Given any partition $\Pi$ (of the unit interval $[0, 1]$, say), we get an integral sum

$$\sum_{k=0}^{n-1} \chi(\xi_k)\,\Delta x_k, \tag{21}$$

equal to 1 if we choose the points $\xi_k$ to be rational. However, if we choose the points $\xi_k$ to be irrational (with the same partition $\Pi$), we find that the sum (21) equals 0. Hence the Riemann sums of the Dirichlet function do not approach a limit under arbitrary refinement of the partition, i.e., the Dirichlet function is nonintegrable.
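The two choices of marked points can be imitated exactly on a computer, provided irrational numbers are represented symbolically, since every floating-point number is rational. The sketch below is one such illustration in Python: a hypothetical helper type `Point` stands for the number $q + r\sqrt{2}$ with rational $q$ and $r$, which is rational precisely when $r = 0$. All names and the uniform partition are assumptions made for this example.

```python
from fractions import Fraction
from typing import NamedTuple

class Point(NamedTuple):
    """The real number q + r*sqrt(2) with rational q and r (hypothetical helper)."""
    q: Fraction
    r: Fraction

def dirichlet(p: Point) -> int:
    # q + r*sqrt(2) is rational exactly when r == 0.
    return 1 if p.r == 0 else 0

def dirichlet_sum(n: int, irrational_marks: bool) -> Fraction:
    """Sum (21) for the uniform partition of [0, 1] into n subintervals."""
    dx = Fraction(1, n)
    total = Fraction(0)
    for k in range(n):
        x_k = k * dx
        if irrational_marks:
            # x_k + (sqrt(2)/2)*dx lies strictly inside [x_k, x_k + dx].
            mark = Point(x_k, dx / 2)
        else:
            mark = Point(x_k + dx / 2, Fraction(0))  # rational midpoint
        total += dirichlet(mark) * dx
    return total

for n in (1, 10, 1000):
    assert dirichlet_sum(n, irrational_marks=False) == 1
    assert dirichlet_sum(n, irrational_marks=True) == 0
```

However fine the partition, the two sums stay at 1 and 0, so no common limit exists.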

Thus we see that, so to speak, the boundary between integrable and 
nonintegrable functions lies somewhere in the region of functions with infinitely 
many points of discontinuity. In fact, there is a theorem which asserts that a 
function f(x) has a Riemann integral if and only if the set of all its points of 
discontinuity can be covered by a finite or countable set of intervals whose total 
length is less than $\delta$, where $\delta > 0$ is arbitrary (see Problem 9). However, in
problems involving the integration of discontinuous functions, it is best to use 
another kind of integral, which is more general than the Riemann integral, 
namely the Lebesgue integral. 


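9.18. Suppose $f(x)$ is integrable on the interval $a \le x \le b$. Then we set

$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx \tag{22}$$

by definition. In particular, (22) implies

$$\int_a^a f(x)\,dx = 0.$$

The formula

$$\int_\alpha^\gamma f(x)\,dx = \int_\alpha^\beta f(x)\,dx + \int_\beta^\gamma f(x)\,dx \tag{23}$$

is now valid for arbitrary $\alpha, \beta, \gamma \in [a, b]$, regardless of their relative position in the interval $[a, b]$. In fact, (23) has already been proved for the case $\alpha < \beta < \gamma$ (see Theorem 9.15h). Choosing another configuration of the points $\alpha$, $\beta$, and $\gamma$, say $\gamma < \beta < \alpha$, we have

$$\int_\gamma^\alpha f(x)\,dx = \int_\gamma^\beta f(x)\,dx + \int_\beta^\alpha f(x)\,dx$$

as already proved, and hence, by (22),

$$\int_\alpha^\gamma f(x)\,dx = -\int_\gamma^\alpha f(x)\,dx = -\int_\gamma^\beta f(x)\,dx - \int_\beta^\alpha f(x)\,dx = \int_\alpha^\beta f(x)\,dx + \int_\beta^\gamma f(x)\,dx$$

by (22) again. The argument is the same for any other configuration of the points $\alpha$, $\beta$, and $\gamma$.
Note that (22) immediately implies

$$\frac{1}{a - b}\int_b^a f(x)\,dx = -\frac{1}{a - b}\int_a^b f(x)\,dx = \frac{1}{b - a}\int_a^b f(x)\,dx. \tag{24}$$

Hence, if $m \le f(x) \le M$, the estimate (14) can be written in the alternative form

$$m \le \frac{1}{a - b}\int_b^a f(x)\,dx \le M. \tag{25}$$

In other words, in estimating the mean value of $f(x)$, it does not matter whether $a < b$ or $b < a$.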


9.2. Area and Arc Length 


We now give two geometrical examples illustrating the utility of the concept of 
the integral in mathematical analysis. A much more detailed discussion of the 
applications of integration will be given in Secs. 9.6—9.9. 


9.21. Area under a curve. The concept of the area of a simple geometric figure 
bounded by straight line segments is familiar from elementary geometry. But 
what is the “area” of a figure bounded by more general curves? The answer to 
this question is not to be found in elementary geometry.t 

A natural starting point for a general theory of area is the following axiom: 
The area of a figure $\Phi$ is a number no larger than the area $\Omega$ of any “elementary figure” (by which we mean a finite plane region whose boundary consists of a finite number of simple closed polygonal lines†) containing $\Phi$ and no smaller than the area $\omega$ of any elementary figure contained in $\Phi$ (see Figure 21). Since we always have $\omega \le \Omega$, by the theorems of elementary geometry, it follows that

$$\sup \omega \le \inf \Omega$$

(see Theorem 1.62b), where the least upper bound on the left is taken over all elementary figures contained in $\Phi$ and the greatest lower bound on the right is taken over all elementary figures containing $\Phi$. If $\sup \omega = \inf \Omega$, then the only possible value of the area of $\Phi$ is the common value

$$S = \sup \omega = \inf \Omega. \tag{1}$$

Figure 21
Suppose we now apply these general considerations to the case of the area of the “curvilinear trapezoid” $\Phi$ bounded on the bottom by the segment $[a, b]$ of the $x$-axis, on top by the graph of a continuous function $y = f(x) \ge 0$, and on the sides by segments of the vertical lines $x = a$ and $x = b$. The area of this trapezoid is often simply called the “area under the curve $y = f(x)$” (from $x = a$ to $x = b$).
Consider a partition

$$\Pi = \{a = x_0 \le \xi_0 \le x_1 \le \cdots \le x_{n-1} \le \xi_{n-1} \le x_n = b\},$$

where the point $\xi_k \in [x_k, x_{k+1}]$ is chosen so that

$$f(\xi_k) = M_k = \max_{x_k \le x \le x_{k+1}} f(x).$$

Then the corresponding Riemann sum

$$\overline{S}_\Pi(f) = \sum_{k=0}^{n-1} M_k\,\Delta x_k$$

can be interpreted as the area of an elementary figure containing $\Phi$, made up of certain “circumscribed rectangles” (see Figure 22). On the other hand, if we choose $\xi_k$ so that

$$f(\xi_k) = m_k = \min_{x_k \le x \le x_{k+1}} f(x),$$

then the corresponding Riemann sum

$$\underline{S}_\Pi(f) = \sum_{k=0}^{n-1} m_k\,\Delta x_k$$

can be interpreted as the area of an elementary figure contained in $\Phi$, made up of certain “inscribed rectangles” (see Figure 23).

Figure 22

Figure 23

Therefore, according to our basic axiom, every number $S$ which can lay claim to be a value of the “area” of the figure $\Phi$ must satisfy the condition

$$\underline{S}_\Pi(f) \le S \le \overline{S}_\Pi(f).$$

But $f(x)$ is integrable, being continuous (see Theorem 9.14e), so that $\overline{S}_\Pi(f)$ and $\underline{S}_\Pi(f)$ approach the same limit, equal to the integral of $f(x)$ over $[a, b]$, under arbitrary refinement of the partition $\Pi$. It follows that

$$S = \int_a^b f(x)\,dx \tag{2}$$

is the only possible value of the area of the figure $\Phi$. The fact that $S$ exists as defined by (1) in terms of arbitrary elementary figures (instead of just elementary figures of a special kind, made up of rectangles) can be shown with the help of the general considerations given in Appendix B (esp. Secs. B.4, B.5, and B.7).
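The squeezing of the area between inscribed and circumscribed rectangles can be followed in exact arithmetic. The sketch below assumes the increasing function $f(x) = x^2$ on $[0, 1]$, so that the maximum and minimum on each subinterval are attained at its right and left end points; the function names and the comparison value $1/3$ for the area are assumptions introduced for illustration.

```python
from fractions import Fraction

def f(x):
    # Increasing on [0, 1], so extreme values on a subinterval sit at its end points.
    return x * x

def upper_and_lower_sums(n):
    """Areas of the circumscribed and inscribed rectangles for a uniform partition."""
    dx = Fraction(1, n)
    upper = sum(f((k + 1) * dx) * dx for k in range(n))  # M_k = f(x_{k+1})
    lower = sum(f(k * dx) * dx for k in range(n))        # m_k = f(x_k)
    return upper, lower

for n in (10, 100, 1000):
    upper, lower = upper_and_lower_sums(n)
    # The area under the curve is trapped between the two sums ...
    assert lower < Fraction(1, 3) < upper
    # ... and the gap (f(1) - f(0)) / n shrinks as the partition is refined.
    assert upper - lower == Fraction(1, n)
```

Exact rational arithmetic is used so that the gap between the two sums can be compared with $1/n$ by equality rather than by a tolerance.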

In three-dimensional space, the concept of area goes over into that of 
“volume.” A theory of “measure” (1.e., “generalized volume’) and integration on 
a compact metric space is outlined in Appendix B. 


9.22. Arc length. In elementary geometry, one encounters first the concept of
the length of a line segment or more generally a polygonal line, and then the
concept of the length of a circular arc, defined as the limit of the length of a
polygonal line “inscribed” in the arc as the length of the segments making up the
line approaches zero. Without lingering to make this latter definition more
precise,† we pass at once to the problem of defining the length of an arbitrary
curve L with equation


$$y = f(x) \qquad (a \le x \le b).$$

[Figure 24: polygonal line inscribed in the curve y = f(x) over a partition of [a, b]]

Consider a partition 

$$\Pi = \{a = x_0 \le \xi_0 \le x_1 \le \xi_1 \le x_2 \le \cdots \le x_{n-1} \le \xi_{n-1} \le x_n = b\},$$


where the points $\xi_0, \xi_1, \ldots, \xi_{n-1}$ are as yet undetermined, and
construct the polygonal line $L_\Pi$ with vertices at the points
$(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$. Then $L_\Pi$ is “inscribed” in the
curve L, in the sense illustrated in Figure 24. Now let $s(L_\Pi)$ be the length
of $L_\Pi$. Then clearly


$$s(L_\Pi) = \sum_{k=0}^{n-1} \sqrt{(\Delta x_k)^2 + (\Delta y_k)^2}, \tag{3}$$


where $\Delta x_k = x_{k+1} - x_k$, $\Delta y_k = y_{k+1} - y_k$. Suppose f(x) is
continuous and differentiable, with derivative f'(x). Then, by Lagrange’s theorem
(Sec. 7.44), we can choose a point $\xi_k \in (x_k, x_{k+1})$ such that


$$\sqrt{(\Delta x_k)^2 + (\Delta y_k)^2} = \sqrt{1 + (\Delta y_k/\Delta x_k)^2}\,\Delta x_k = \sqrt{1 + [f'(\xi_k)]^2}\,\Delta x_k.$$


We can then write (3) in the form


$$s(L_\Pi) = \sum_{k=0}^{n-1} \sqrt{1 + [f'(\xi_k)]^2}\,\Delta x_k. \tag{4}$$


This expression is a Riemann sum of the function


$$g(x) = \sqrt{1 + [f'(x)]^2},$$


corresponding to the partition Π with marked points
$\xi_0, \xi_1, \ldots, \xi_{n-1}$. If g(x) is integrable, then the right-hand side
of (4) approaches the integral


$$\int_a^b g(x)\,dx$$


under arbitrary refinement of the partition Π. Thus, finally, we define s(L), the
length of the curve L, by the formula


$$s(L) = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx. \tag{5}$$
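
As a numerical illustration of (3) and (5), the following Python sketch compares the length of inscribed polygonal lines with the value of the integral. The choice f(x) = cosh x on [0, 1] and the function names are assumptions made for the example; for this f we have $\sqrt{1 + [f'(x)]^2} = \cosh x$, so that (5) gives sinh 1 − sinh 0.

```python
import math

def polygon_length(f, a, b, n):
    """Length of the polygonal line inscribed in y = f(x), n equal subintervals."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(
        math.hypot(xs[k + 1] - xs[k], f(xs[k + 1]) - f(xs[k]))
        for k in range(n)
    )

def exact_cosh_length(a, b):
    # For f = cosh, sqrt(1 + f'(x)^2) = cosh x, so formula (5) gives sinh b - sinh a.
    return math.sinh(b) - math.sinh(a)

for n in (4, 16, 64, 256):
    print(n, polygon_length(math.cosh, 0.0, 1.0, n), exact_cosh_length(0.0, 1.0))
```

The printed polygon lengths increase toward the integral as the partition is refined, which is the behavior the definition of s(L) relies on.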


9.23. Writing integrals like (2) and (5) would be of little value, and would in fact 
be hardly more than a tautology, if there were no way of calculating integrals 
other than that given in the direct definition of Sec. 9.13. Actually, however, 
there are other powerful methods for evaluating integrals (presented in Secs. 9.4 
and 9.5). These methods rest on the considerations of Sec. 9.3, of great 
importance in their own right. 


9.3. Antiderivatives and Indefinite Integrals 


9.31. The integral as a function of its upper limit. Let the function f(x) be 
integrable on the interval [a, b], and consider the integral 


$$F(x) = \int_a^x f(t)\,dt \qquad (a \le x \le b), \tag{1}$$


where we denote the variable of integration by t to avoid confusion with the
upper limit of integration x. Then (1) is a new function of x, defined on the
interval [a, b].


a. THEOREM. The function F(x) is continuous on [a, b].
Proof. An immediate consequence of the estimate


$$|F(x'') - F(x')| = \left|\int_{x'}^{x''} f(t)\,dt\right| \le (x'' - x') \sup_{a \le x \le b} |f(x)|$$


(see Secs. 9.15e and 9.18). ∎


b. THEOREM. If f(x) is continuous from the left (or from the right) at x = c, then
F(x) has a left-hand (or right-hand) derivative at x = c, equal to f(c).


Proof. Clearly


$$\frac{F(c+h) - F(c)}{h} = \frac{1}{h}\left[\int_a^{c+h} f(t)\,dt - \int_a^c f(t)\,dt\right] = \frac{1}{h}\int_c^{c+h} f(t)\,dt, \tag{2}$$


where the expression on the right is the mean value of f(x) on the interval
[c, c + h].† Let


$$m_h = \inf_{x \in [c,\,c+h]} f(x), \qquad M_h = \sup_{x \in [c,\,c+h]} f(x).$$


Then


$$m_h \le \frac{1}{h}\int_c^{c+h} f(t)\,dt = \frac{F(c+h) - F(c)}{h} \le M_h$$


(cf. formula (14), p. 280 and formula (25), p. 286). If f(x) is continuous from the
left (or from the right) at x = c, then $m_h \to f(c)$, $M_h \to f(c)$ as
$h \nearrow 0$ (or $h \searrow 0$), i.e., the limit of (2) as $h \nearrow 0$ (or
$h \searrow 0$) exists and equals f(c), or equivalently, F(x) has a left-hand (or
right-hand) derivative at x = c, equal to f(c).


COROLLARY. If f(x) is continuous (from both sides) at x = c, then F(x) has a
derivative at x = c, equal to f(c).


Proof. F(x) has a derivative at x = c if and only if F(x) has equal left-hand and
right-hand derivatives at x = c (Sec. 7.19c). But, by the preceding theorem, these
two one-sided derivatives exist and equal f(c).


Note that if f(x) is continuous on [a, b] but on no larger interval, then the 
derivative of F(x) at the end point x = a can only be a right-hand derivative, 
equal to f(a), while the derivative of F(x) at the end point x = b can only be a 
left-hand derivative, equal to f(b). 


9.32. Antiderivatives. Let f(x) be a piecewise continuous function on [a, b].
Then any continuous function G(x) defined on [a, b] with a derivative equal to
f(x) at every continuity point of f(x) is said to be an antiderivative of f(x) on
[a, b]. For example, the function $x^n$ is an antiderivative of $nx^{n-1}$ on every closed
interval. Given any piecewise continuous function f(x), it follows from the above 
corollary that the function F(x) defined by (1) is an antiderivative of f(x). If G(x) 
is an antiderivative of f(x), then obviously so is every function 


H(x) =G(x) +C, (3) 


where C is a constant (so that, in particular, f(x) has infinitely many
antiderivatives). The converse is also true, as shown by the following


THEOREM. Every antiderivative H(x) of the function f(x) on the interval [a, b] is of 
the form (3), where G(x) is any fixed antiderivative of f(x) on [a, b]. 


Proof. Clearly


$$[H(x) - G(x)]' = f(x) - f(x) = 0$$


at every continuity point of f(x), and hence, by Theorem 7.45b, H(x) − G(x) is
constant on every interval on which f(x) is continuous. But G(x) and H(x) are
continuous on the whole interval [a, b], and hence so is H(x) − G(x). Therefore
H(x) − G(x) is constant on the whole interval [a, b].


9.33. The following key proposition, relating the operations of differentiation 
and integration, is often called the “fundamental theorem of calculus”: 


THEOREM. If f(x) is piecewise continuous on [a, b], then


$$\int_a^b f(x)\,dx = G(b) - G(a), \tag{4}$$


where G(x) is any antiderivative of f(x) on [a, b].


Proof. By the preceding theorem, every other antiderivative of f(x), in particular 
the function (1), differs from G(x) by a constant, so that 


$$F(x) = \int_a^x f(t)\,dt = G(x) + C.$$

To determine the constant C, let x = a. Then obviously F(a) = 0, and hence C =
−G(a), so that

$$\int_a^x f(t)\,dt = G(x) - G(a). \tag{5}$$

To get (4), we now set x = b in (5), afterwards changing the variable of
integration from t to x.
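
Formula (4) can be checked numerically. The sketch below is an illustration only: the function name `riemann_sum`, the midpoint choice of marked points, and the example f(x) = cos x with antiderivative G(x) = sin x are assumptions, not part of the text.

```python
import math

def riemann_sum(f, a, b, n):
    """Riemann sum of f over [a, b], n equal subintervals, midpoints as marked points."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

a, b = 0.0, 2.0
approx = riemann_sum(math.cos, a, b, 10_000)
exact = math.sin(b) - math.sin(a)   # G(b) - G(a) with G = sin
print(approx, exact)
```

The Riemann sum on the left of (4) and the difference G(b) − G(a) on the right agree to many decimal places, although they are computed in entirely different ways.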


9.34. Indefinite versus definite integrals. The set of all antiderivatives of a 
function f(x) is called the indefinite integral of f(x), denoted by 


$$\int f(x)\,dx$$

(without limits of integration). By contrast, the expression


$$\int_a^b f(x)\,dx$$

(defined in Sec. 9.13) is called the definite integral of f(x). The adjectives 
“indefinite” and “definite” are often dropped in cases where the kind of integral 
under discussion is clear from the context. 


9.35. Formula (4) is often written as 


$$\int_a^b f(x)\,dx = G(x)\Big|_{x=a}^{x=b}$$


or simply


$$\int_a^b f(x)\,dx = G(x)\Big|_a^b,$$


where each of the expressions on the right stands for the value of the function
G(x) at x = b minus its value at x = a. Thus, for example,


$$\int_0^b nx^{n-1}\,dx = x^n\Big|_0^b = b^n.$$


Geometrically, this means that the area under the curve $y = x^{n-1}$ from x = 0
to x = b equals 1/n times the area of the rectangle ABCD shown in Figure 25 (where
the points A and B have abscissas 0 and b).


[Figure 25: the curve $y = x^{n-1}$ and the rectangle ABCD over the interval [0, b]]


9.36. Let f(x) be a function continuous on [a, b], and let F(x) be the indefinite
integral of f(x), so that F'(x) = f(x). Then, by the definition of the indefinite
integral,


$$d\int f(x)\,dx = d[F(x) + C] = F'(x)\,dx = f(x)\,dx,$$


so that the signs d and ∫ (in that order) cancel each other out. On the other hand,
by the definition of the differential and the indefinite integral,


$$\int dF(x) = \int F'(x)\,dx = F(x) + C, \tag{6}$$

so that the signs ∫ and d (in that order) cancel each other out, provided we add a
constant to the right-hand side.


9.37. Next we prove a number of general rules of indefinite integration, starting 
from the formulas 


d(ku) =k du (k constant), (7) 
d(u+v) =du + dv, (8) 
d(uv) =u dv+v du, (9) 


proved in Sec. 7.33. It follows from (6) and (7) that 
$$\int k\,du = \int d(ku) = ku + C = k(u + C_1) = k\int du,$$


where C and $C_1 = C/k$ are arbitrary “constants of integration,” i.e., a constant
can always be brought out in front of the indefinite integral sign (cf. Theorem
9.15a). Similarly, (6) and (8) imply


$$\int (du + dv) = \int d(u + v) = u + v + C = \int du + \int dv, \tag{10}$$


where the constant of integration is absorbed into the indefinite integrals on the
right, i.e., the indefinite integral of a sum equals the sum of the indefinite
integrals of the separate terms (cf. Theorem 9.15b). Finally, it follows from (6),
(9), and (10) that


$$\int (u\,dv + v\,du) = \int u\,dv + \int v\,du = \int d(uv) = uv + C,$$

or equivalently,

$$\int u\,dv = uv - \int v\,du, \tag{11}$$


a result known as the formula for integration by parts (the constant of
integration has been absorbed into the second integral). This important formula
allows us to determine the integral ∫ u dv from a knowledge of the integral ∫ v du.


9.38. THEOREM. Suppose the function u(t) has a continuous derivative u'(t) on
α ≤ t ≤ β and takes values in A ≤ u ≤ B, while the function f(u) is continuous on
A ≤ u ≤ B, and let


$$F(u) = \int f(u)\,du, \qquad G(t) = \int f(u(t))\,u'(t)\,dt.$$


Then

$$F(u(t)) = G(t) + C, \tag{12}$$

i.e., du can be replaced by u'(t) dt in the indefinite integral ∫ f(u) du.†
Proof. We need only prove that both sides of (12) have the same differential. But

$$dF(u) = F'(u)\,du = F'(u)\,u'(t)\,dt = f(u(t))\,u'(t)\,dt$$

(see Sec. 7.34), while

$$dG(t) = f(u(t))\,u'(t)\,dt. \qquad ∎$$


9.39. Finally we prove the converse of Theorem 9.31b. Given a function F(x)
continuous on [a, b], suppose F(x) has a derivative F'(x) at all but a finite
number of points of [a, b], where F'(x) is piecewise continuous on [a, b]. Then
F(x) is said to be piecewise smooth on [a, b]. In effect, Theorem 9.31b says that
if f(x) is piecewise continuous on [a, b], then


$$F(x) = \int_a^x f(t)\,dt$$

is piecewise smooth on [a, b]. Conversely, suppose G(x) is piecewise smooth on
[a, b], and let f(x) be its piecewise continuous derivative. Then G(x) is an
antiderivative of f(x), by Sec. 9.32, and hence


$$G(x) = \int_a^x f(t)\,dt + G(a),$$


by the same argument as in the proof of Theorem 9.33. It follows that to within 
an additive constant every piecewise smooth function G(x) is the integral of a 
piecewise continuous function (with a variable upper limit of integration). 


9.4. Technique of Indefinite Integration 


9.41. Integration of polynomials 
a. Applying the sign ∫ to both sides of the formula

$$d(x^n) = nx^{n-1}\,dx \qquad (n = 1, 2, \ldots),$$


and then changing n − 1 to m and dividing by n = m + 1, we get


$$\int x^m\,dx = \frac{x^{m+1}}{m+1} + C \qquad (m = 0, 1, 2, \ldots), \tag{1}$$

where C is an arbitrary constant.


b. Let


$$P(x) = a_0 + a_1 x + \cdots + a_n x^n$$


be any polynomial. Then, using (1) and the rules of Sec. 9.37, we find that


$$\int P(x)\,dx = a_0 x + a_1\frac{x^2}{2} + \cdots + a_n\frac{x^{n+1}}{n+1} + C. \tag{2}$$


c. In some cases, it is more economical to use a less direct approach. For
example, it would be absurd to evaluate the integral


$$\int x(x^2 + 1)^{500}\,dx$$


by first using the binomial theorem to expand the expression $(x^2 + 1)^{500}$ and
then applying (2). The sensible thing to do is to use Theorem 9.38 together with
the substitution $u = x^2 + 1$. This at once gives


$$\int x(x^2 + 1)^{500}\,dx = \frac{1}{2}\int u^{500}\,du = \frac{1}{2}\,\frac{u^{501}}{501} + C = \frac{(x^2 + 1)^{501}}{1002} + C. \tag{3}$$


d. To evaluate the integral

$$\int x^3(x^2 + 1)^{500}\,dx,$$

we use formula (11), p. 295 to integrate by parts, setting


$$u = x^2, \qquad dv = x(x^2 + 1)^{500}\,dx.$$


Since

$$du = 2x\,dx, \qquad v = \frac{(x^2 + 1)^{501}}{1002},$$

by (3), the result is

$$\int x^3(x^2 + 1)^{500}\,dx = uv - \int v\,du = \frac{x^2(x^2 + 1)^{501}}{1002} - \int \frac{2x(x^2 + 1)^{501}}{1002}\,dx = \frac{x^2(x^2 + 1)^{501}}{1002} - \frac{(x^2 + 1)^{502}}{1002 \cdot 502} + C.$$


9.42. Integration of rational functions. Next we consider the problem of 
integrating a function of the form 


$$f(x) = \frac{S(x)}{P(x)},$$


where S(x) and P(x) are both polynomials with real coefficients. 


a. If the degree of S(x) is greater than or equal to that of P(x), we can carry out 
the division, thereby reducing f(x) to the form 


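$$f(x) = T(x) + \frac{Q(x)}{P(x)},$$


where T(x) and Q(x) are new polynomials, and the degree of Q(x) is now less
than that of P(x). Thus, having just seen how to integrate polynomials, we need
only consider the problem of integrating the proper fraction


$$\frac{Q(x)}{P(x)}.$$

b. Let $x_1 \pm iy_1, \ldots, x_p \pm iy_p$ be the nonreal roots and
$x_{p+1}, \ldots, x_q$ the real roots of P(x), with corresponding multiplicities
$r_1, \ldots, r_p, r_{p+1}, \ldots, r_q$. Then the rational function Q(x)/P(x) has
the following partial fraction expansion (see Sec. 5.88):


$$\frac{Q(x)}{P(x)} = \frac{A_{11} + B_{11}x}{(x - x_1)^2 + y_1^2} + \cdots + \frac{A_{1r_1} + B_{1r_1}x}{[(x - x_1)^2 + y_1^2]^{r_1}} + \cdots + \frac{A_{p1} + B_{p1}x}{(x - x_p)^2 + y_p^2} + \cdots + \frac{A_{pr_p} + B_{pr_p}x}{[(x - x_p)^2 + y_p^2]^{r_p}}$$
$$\qquad + \frac{A_{p+1,1}}{x - x_{p+1}} + \cdots + \frac{A_{p+1,r_{p+1}}}{(x - x_{p+1})^{r_{p+1}}} + \cdots + \frac{A_{q1}}{x - x_q} + \cdots + \frac{A_{qr_q}}{(x - x_q)^{r_q}}.$$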


Thus our problem reduces to that of integrating the various kinds of partial 
fractions. 


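c. To evaluate the integral


$$\int \frac{dx}{(x - x_k)^n},$$


we integrate both sides of the formula


$$d\left(\frac{1}{(x - x_k)^{n-1}}\right) = \frac{1 - n}{(x - x_k)^n}\,dx,$$

obtaining

$$\int \frac{dx}{(x - x_k)^n} = \frac{1}{(1 - n)(x - x_k)^{n-1}} + C. \tag{4}$$

d. If n = 1, we start from formula (12), p. 228, which implies

$$d(\ln|x - x_k|) = \frac{dx}{x - x_k},$$

and hence

$$\int \frac{dx}{x - x_k} = \ln|x - x_k| + C. \tag{5}$$


Note that (4) and (5) hold in any interval [a, b] which does not contain the point
$x_k$.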


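e. The substitution $x - x_k = y_k u$ reduces the integral


$$\int \frac{Ax + B}{[(x - x_k)^2 + y_k^2]^n}\,dx$$


to the simpler form


$$\int \frac{Cu + D}{(u^2 + 1)^n}\,du.$$


If D = 0, the integral is easily evaluated by making the further substitution
$u^2 + 1 = v$:


$$\int \frac{Cu\,du}{(u^2 + 1)^n} = \frac{C}{2}\int \frac{dv}{v^n} = \begin{cases} \dfrac{C}{2(1 - n)}\,\dfrac{1}{(u^2 + 1)^{n-1}} + C_1 & \text{if } n > 1, \\[2ex] \dfrac{C}{2}\ln|v| + C_1 = \dfrac{C}{2}\ln(u^2 + 1) + C_1 & \text{if } n = 1. \end{cases}$$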
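f. We must still consider the integral

$$I_n = \int \frac{du}{(u^2 + 1)^n}. \tag{6}$$

If n = 1, then

$$I_1 = \int \frac{du}{u^2 + 1} = \text{arc tan}\,u + C,$$


by formula (22), p. 231. Suppose n ≥ 1. Then applying the formula for
integration by parts to (6), with the factor $1/(u^2 + 1)^n$ differentiated and
du integrated, so that


$$d\left(\frac{1}{(u^2 + 1)^n}\right) = -\frac{2nu\,du}{(u^2 + 1)^{n+1}},$$

we get


$$I_n = \int \frac{du}{(u^2 + 1)^n} = \frac{u}{(u^2 + 1)^n} + 2n\int \frac{u^2\,du}{(u^2 + 1)^{n+1}}$$

$$\qquad = \frac{u}{(u^2 + 1)^n} + 2n\left[\int \frac{du}{(u^2 + 1)^n} - \int \frac{du}{(u^2 + 1)^{n+1}}\right]$$


$$\qquad = \frac{u}{(u^2 + 1)^n} + 2n(I_n - I_{n+1}),$$


and hence


$$I_{n+1} = \frac{1}{2n}\,\frac{u}{(u^2 + 1)^n} + \frac{2n - 1}{2n}\,I_n. \tag{7}$$


This recursion formula expresses $I_{n+1}$ in terms of $I_n$ and a rational
function. For example, setting n = 1 in (7), we get


$$I_2 = \int \frac{du}{(u^2 + 1)^2} = \frac{1}{2}\,\frac{u}{u^2 + 1} + \frac{1}{2}\,\text{arc tan}\,u + C;$$


setting n = 2, we find a more complicated expression for $I_3$, and so on.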
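
The recursion (7) lends itself to direct computation. The following Python sketch is an illustration under stated assumptions: it takes the lower limit of integration to be 0 (which fixes the constant of integration), and the names `I` and `midpoint` are hypothetical helpers introduced for the example. The recursive values are compared with a direct numerical quadrature.

```python
import math

def I(n, u):
    """Integral from 0 to u of dt / (t^2 + 1)^n, computed via recursion (7)."""
    if n == 1:
        return math.atan(u)
    m = n - 1  # formula (7) with n replaced by m gives I_{m+1} from I_m
    return u / (2 * m * (u * u + 1) ** m) + (2 * m - 1) / (2 * m) * I(m, u)

def midpoint(f, a, b, steps=20_000):
    """Midpoint Riemann sum, used here only as an independent check."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h

u = 1.5
for n in (1, 2, 3, 4):
    print(n, I(n, u), midpoint(lambda t: 1.0 / (t * t + 1) ** n, 0.0, u))
```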


g. Thus, finally, the indefinite integral of every rational function can be 
expressed entirely in terms of rational functions, logarithms, and arc tangents. 
For more on the practical technique of integrating rational functions, we refer to 
the literature.


9.43. Rationalizing substitutions. Next we show how certain types of integrals 
can be “rationalized,” i.e., reduced to integrals of rational functions, by making
appropriate standard substitutions. In what follows, R denotes a rational function 
of one or two arguments. 


a. The integral


$$\int R(e^x)\,dx$$

can be rationalized by the substitution

$$e^x = u, \qquad x = \ln u, \qquad dx = \frac{du}{u}.$$


b. The integral


$$\int R(\tan x)\,dx$$


can be rationalized by the substitution


$$\tan x = u, \qquad x = \text{arc tan}\,u, \qquad dx = \frac{du}{1 + u^2}. \tag{8}$$


Since the functions


$$\cos^2 x = \frac{1}{1 + \tan^2 x},$$


$$\sin^2 x = 1 - \cos^2 x,$$


$$\cos 2x = \cos^2 x - \sin^2 x,$$


$$\sin 2x = 2\sin x\cos x = 2\tan x\cos^2 x$$


are all rational functions of tan x, the integral


$$\int R(\sin 2x, \cos 2x)\,dx$$


can also be rationalized by the substitution (8).


c. Unlike sin 2x and cos 2x, the functions sin x and cos x are not rational
functions of tan x (this can be seen from the fact that sin x and cos x fail to have
the period π of the function tan x). However,


$$\sin x = \frac{2\tan(x/2)}{1 + \tan^2(x/2)}, \qquad \cos x = \frac{1 - \tan^2(x/2)}{1 + \tan^2(x/2)},$$

and hence this time the integral


$$\int R(\sin x, \cos x)\,dx$$


is rationalized by the substitution


$$\tan\frac{x}{2} = u, \qquad x = 2\,\text{arc tan}\,u, \qquad dx = \frac{2\,du}{1 + u^2}. \tag{8'}$$

9.44. Integration of irrational expressions. Consider the integral 
$$\int R(x, y)\,dx, \tag{9}$$


where y is some function of x. If x and y can both be represented as rational
functions


$$x = x(t), \qquad y = y(t)$$


of a new variable t, then (9) can be rationalized in an obvious way:

$$\int R(x, y)\,dx = \int R(x(t), y(t))\,x'(t)\,dt.$$


a. For example, suppose


$$y = \sqrt{ax + b}.$$

Then

$$ax + b = t^2, \qquad x = \frac{t^2 - b}{a}, \qquad y = t$$


is a rationalizing substitution, since


$$\int R(x, \sqrt{ax + b})\,dx = \int R\left(\frac{t^2 - b}{a}, t\right)\frac{2t}{a}\,dt.$$


b. The same method works for


$$y = \sqrt{\frac{\alpha x + \beta}{\gamma x + \delta}}$$


if we choose


$$\frac{\alpha x + \beta}{\gamma x + \delta} = t^2, \qquad x = \frac{\delta t^2 - \beta}{\alpha - \gamma t^2}, \qquad y = t$$


(give the details).
c. However, the substitution

$$ax^2 + bx + c = t^2$$

no longer rationalizes the square root

$$y = \sqrt{ax^2 + bx + c}, \tag{10}$$


since in this case x does not turn out to be a rational function of t. Here we
proceed differently. If α is such that b = −2aα, then


$$a(x - \alpha)^2 + c - a\alpha^2 = ax^2 + bx + c.$$


Now let x − α = βu. Then, apart from a multiplicative constant, (10) reduces to
one of the expressions


$$y_1 = \sqrt{1 - u^2}, \qquad y_2 = \sqrt{1 + u^2}, \qquad y_3 = \sqrt{u^2 - 1}$$

for a suitable choice of β.


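d. In the first case ($y_1 = \sqrt{1 - u^2}$), we make the substitution u = sin θ
or u = cos θ, thereby obtaining one of the integrals†


$$\int R(u, \sqrt{1 - u^2})\,du = \int R(\sin\theta, \cos\theta)\cos\theta\,d\theta$$

or

$$\int R(u, \sqrt{1 - u^2})\,du = -\int R(\cos\theta, \sin\theta)\sin\theta\,d\theta$$


already considered in Sec. 9.43c. For example,


$$\int \frac{du}{\sqrt{1 - u^2}} = \int \frac{\cos\theta\,d\theta}{\cos\theta} = \int d\theta = \theta = \text{arc sin}\,u + C, \tag{11}$$

if u = sin θ, or

$$\int \frac{du}{\sqrt{1 - u^2}} = -\int \frac{\sin\theta\,d\theta}{\sin\theta} = -\int d\theta = -\theta = -\text{arc cos}\,u + C, \tag{12}$$


if u = cos θ.† The difference between (11) and (12) should not bother us, since
according to the general theory, the two results can differ by a constant, and this
is in fact the case, as shown by the formula


$$-\text{arc cos}\,u = \text{arc sin}\,u - \frac{\pi}{2} \tag{13}$$

(see Sec. 5.67).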
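e. In the second case ($y_2 = \sqrt{1 + u^2}$), we make the substitution

$$u = \sinh\theta, \qquad 1 + u^2 = 1 + \sinh^2\theta = \cosh^2\theta,$$


leading to the integral


$$\int R(u, \sqrt{1 + u^2})\,du = \int R(\sinh\theta, \cosh\theta)\cosh\theta\,d\theta.$$


For example,


$$\int \frac{du}{\sqrt{1 + u^2}} = \int \frac{\cosh\theta\,d\theta}{\cosh\theta} = \int d\theta = \theta = \text{arc sinh}\,u + C,$$


where the function θ = arc sinh u is the inverse of the function u = sinh θ. We
can express arc sinh u in terms of more familiar functions by solving the
equation


$$u = \sinh\theta = \frac{e^\theta - e^{-\theta}}{2} = \frac{1}{2}\left(e^\theta - \frac{1}{e^\theta}\right)$$


for θ. This gives first

$$e^\theta = u + \sqrt{1 + u^2}$$

(where we drop the minus sign since $e^\theta > 0$) and then

$$\theta = \text{arc sinh}\,u = \ln\left(u + \sqrt{1 + u^2}\right).$$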
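
The closed form for arc sinh u is easy to confirm numerically. In the sketch below, `arcsinh_log` is a hypothetical helper name introduced for the illustration; it is compared with the standard library function `math.asinh`.

```python
import math

def arcsinh_log(u):
    """arc sinh u expressed through the logarithm: ln(u + sqrt(1 + u^2))."""
    return math.log(u + math.sqrt(1.0 + u * u))

for u in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(u, math.asinh(u), arcsinh_log(u))
```

Since $u + \sqrt{1 + u^2} > 0$ for every real u, the logarithm is defined on the whole real line, in agreement with the single-valued character of arc sinh.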

f. Finally, in the third case ($y_3 = \sqrt{u^2 - 1}$), the appropriate
substitution is

$$u = \cosh\theta, \qquad u^2 - 1 = \cosh^2\theta - 1 = \sinh^2\theta,$$

leading to the integral

$$\int R(u, \sqrt{u^2 - 1})\,du = \int R(\cosh\theta, \sinh\theta)\sinh\theta\,d\theta.$$

For example,


$$\int \frac{du}{\sqrt{u^2 - 1}} = \int \frac{\sinh\theta\,d\theta}{\sinh\theta} = \int d\theta = \theta = \text{arc cosh}\,u + C,$$


where the function θ = arc cosh u is the inverse of the function u = cosh θ. Just
as before, we can express arc cosh u in terms of more familiar functions by
solving the equation


$$u = \cosh\theta = \frac{e^\theta + e^{-\theta}}{2} = \frac{1}{2}\left(e^\theta + \frac{1}{e^\theta}\right)$$


for θ, obtaining first

$$e^\theta = u \pm \sqrt{u^2 - 1}$$

(both signs are permissible) and then

$$\theta = \text{arc cosh}\,u = \ln\left(u \pm \sqrt{u^2 - 1}\right).$$

[Figure 27: graph of the function y = arc cosh x]


g. The graphs of the functions y = arc sinh x and y = arc cosh x† are obtained in
the usual way (Sec. 5.35d), i.e., by reflecting the graphs of the functions y = sinh
x and y = cosh x in the line y = x bisecting the first quadrant (see Figures 26 and
27). This construction confirms the single-valued character of arc sinh x and the
double-valued character of arc cosh x (cf. Sec. 2.84).


9.45. Elliptic integrals. In the case where the polynomial P(x) is of degree n >
2, the integral

$$\int R(x, \sqrt{P(x)})\,dx \tag{14}$$


cannot in general be expressed in terms of elementary functions.‡ The integral
(14) is said to be elliptic if n = 3 or n = 4 and hyperelliptic if n > 4. The two
most important elliptic integrals are†


$$F(k, \varphi) = \int_0^{\sin\varphi} \frac{dt}{\sqrt{(1 - t^2)(1 - k^2t^2)}} = \int_0^\varphi \frac{d\theta}{\sqrt{1 - k^2\sin^2\theta}},$$


$$E(k, \varphi) = \int_0^{\sin\varphi} \sqrt{\frac{1 - k^2t^2}{1 - t^2}}\,dt = \int_0^\varphi \sqrt{1 - k^2\sin^2\theta}\,d\theta,$$


where 0 ≤ k ≤ 1 and in each case we make the simplifying substitution t = sin θ.
For k = 0 both integrals give the same function


$$F(0, \varphi) = \int_0^\varphi d\theta = \varphi, \qquad E(0, \varphi) = \int_0^\varphi d\theta = \varphi,$$


while for k = 1 they can be expressed in terms of elementary functions:


$$F(1, \varphi) = \int_0^{\sin\varphi} \frac{dt}{1 - t^2} = \frac{1}{2}\ln\frac{1 + t}{1 - t}\bigg|_0^{\sin\varphi} = \frac{1}{2}\ln\frac{1 + \sin\varphi}{1 - \sin\varphi},$$


$$E(1, \varphi) = \int_0^{\sin\varphi} dt = t\Big|_0^{\sin\varphi} = \sin\varphi.$$


There exist extensive tables of F(k, φ) and E(k, φ) as functions of the angle φ and
another angle α, related to the “modulus” k by the formula k = sin α. The table
below lists a few values of F(k, φ) and E(k, φ).

The behavior of F(k, φ) and E(k, φ) as functions of φ for various values of k
(or α) is shown in Figure 28. Elliptic integrals come up in calculating the arc
length of an ellipse (see Sec. 9.63c), which explains the adjective “elliptic.” They
also play a role in many other problems of analysis, and hence have been studied
in great detail.


        E(k, φ)                      E = F = φ    F(k, φ)
φ       α=90°  α=60°  α=45°  α=30°   α=0°         α=30°  α=45°  α=60°  α=90°
30°     0.50   0.51   0.51   0.52    0.52 = π/6   0.53   0.54   0.54   0.55
45°     0.71   0.73   0.75   0.77    0.78 = π/4   0.80   0.82   0.85   0.88
60°     0.87   0.92   0.96   1.01    1.05 = π/3   1.09   1.14   1.21   1.32
90°     1      1.21   1.35   1.48    1.57 = π/2   1.69   1.85   2.16   ∞

[Figure 28: F(k, φ) and E(k, φ) as functions of φ for various values of k]


It turns out that the general elliptic integral

$$\int R\left(x, \sqrt{ax^4 + bx^3 + cx^2 + ex + f}\right)dx$$


can be expressed (to within an elementary function) in terms of the functions
F(k, φ), E(k, φ), and the “elliptic integral of the third kind”


$$\int_0^\varphi \frac{d\theta}{(1 + h^2\sin^2\theta)\sqrt{1 - k^2\sin^2\theta}},$$


depending on two parameters h and k.†
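
Although F(k, φ) and E(k, φ) are not elementary, their values are easily approximated from the integrals over θ given above. The following Python sketch is an illustration only: it uses a composite Simpson rule (a technique not discussed in this section), and the helper names `simpson`, `F`, and `E` are assumptions made for the example.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

def F(k, phi):
    """Integral from 0 to phi of d(theta) / sqrt(1 - k^2 sin^2 theta)."""
    return simpson(lambda t: 1.0 / math.sqrt(1.0 - (k * math.sin(t)) ** 2), 0.0, phi)

def E(k, phi):
    """Integral from 0 to phi of sqrt(1 - k^2 sin^2 theta) d(theta)."""
    return simpson(lambda t: math.sqrt(1.0 - (k * math.sin(t)) ** 2), 0.0, phi)

for phi_deg in (30, 45, 60):
    phi = math.radians(phi_deg)
    for alpha_deg in (30, 45, 60):
        k = math.sin(math.radians(alpha_deg))   # modulus k = sin(alpha)
        print(phi_deg, alpha_deg, round(F(k, phi), 2), round(E(k, phi), 2))
```

The sketch deliberately avoids the case k = 1, φ = 90°, where the integrand of F is unbounded and F(1, φ) becomes infinite.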


9.46. Other frequently encountered integrals 


a. The integral

$$\int P(x)e^x\,dx,$$


where P(x) is a polynomial, can be expressed as a linear combination of integrals
of the form


$$I_n = \int x^n e^x\,dx.$$


To evaluate $I_n$, we use integration by parts (with $u = x^n$, $dv = e^x\,dx$,
$du = nx^{n-1}\,dx$, $v = e^x$), obtaining the recursion formula


$$I_n = x^n e^x - n\int x^{n-1}e^x\,dx = x^n e^x - nI_{n-1}.$$


But

$$I_0 = \int e^x\,dx = e^x + C,$$


and hence $I_n$ can be expressed entirely in terms of elementary functions. The
same is true of the integrals†


$$\int P(x)\cos x\,dx, \qquad \int P(x)\sin x\,dx, \qquad \int P(x)\ln x\,dx.$$


b. On the other hand, the exponential integral


$$\int \frac{e^x}{x}\,dx, \tag{15}$$


the logarithmic integral

$$\int \frac{du}{\ln u}$$


(obtained from (15) by making the substitution $e^x = u$), the sine integral


$$\int \frac{\sin x}{x}\,dx,$$


and the cosine integral


$$\int \frac{\cos x}{x}\,dx$$


cannot be expressed in terms of elementary functions.‡
9.5. Evaluation of Definite Integrals 
9.51. Integration by parts 


a. Let u and v be piecewise smooth functions. Then, since uv is an antiderivative
of its derivative uv' + vu', it follows from Theorem 9.33 that


$$\int_a^b (uv' + vu')\,dx = \int_a^b (u\,dv + v\,du) = uv\Big|_a^b,$$


or equivalently


$$\int_a^b u\,dv = uv\Big|_a^b - \int_a^b v\,du,$$


a result known as the formula for integration by parts (for a definite integral).


b. Example. Evaluate the integral 


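$$I_m = \int_0^{\pi/2} \sin^m x\,dx.$$


Solution. Integrating by parts, we find that


$$I_m = \int_0^{\pi/2} \sin^{m-1} x\,d(-\cos x) = -\sin^{m-1} x\cos x\Big|_0^{\pi/2} + (m - 1)\int_0^{\pi/2} \sin^{m-2} x\cos^2 x\,dx.$$


The first term on the right vanishes and hence


$$I_m = (m - 1)\int_0^{\pi/2} \sin^{m-2} x\,(1 - \sin^2 x)\,dx = (m - 1)(I_{m-2} - I_m),$$


which implies


$$I_m = \frac{m - 1}{m}\,I_{m-2}.$$

Thus, since

$$I_0 = \int_0^{\pi/2} dx = \frac{\pi}{2},$$

we get

$$I_{2n} = \int_0^{\pi/2} \sin^{2n} x\,dx = \frac{(2n - 1)(2n - 3)\cdots 3 \cdot 1}{2n(2n - 2)\cdots 4 \cdot 2}\,\frac{\pi}{2} \tag{1}$$


for even m = 2n. On the other hand, since


$$I_1 = \int_0^{\pi/2} \sin x\,dx = 1,$$


we get

$$I_{2n+1} = \int_0^{\pi/2} \sin^{2n+1} x\,dx = \frac{2n(2n - 2)\cdots 4 \cdot 2}{(2n + 1)(2n - 1)\cdots 3 \cdot 1} \tag{2}$$


for odd m = 2n + 1.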
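
The reduction formula $I_m = \frac{m-1}{m} I_{m-2}$ translates directly into a short loop. The Python sketch below is illustrative; the names `wallis` and `midpoint` are hypothetical, and the midpoint Riemann sum serves only as an independent check of the closed-form values.

```python
import math

def wallis(m):
    """Integral of sin^m x over [0, pi/2], via I_m = (m - 1)/m * I_{m-2}."""
    value = math.pi / 2 if m % 2 == 0 else 1.0   # start from I_0 or I_1
    for j in range(2 + m % 2, m + 1, 2):
        value *= (j - 1) / j
    return value

def midpoint(f, a, b, steps=20_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h

for m in range(8):
    print(m, wallis(m), midpoint(lambda x: math.sin(x) ** m, 0.0, math.pi / 2))
```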


9.52. Taylor’s formula with remainder in integral form 


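a. Consider the integral


$$\int_a^b f(x)(b - x)^n\,dx.$$


Assuming that f(x) is piecewise smooth, we integrate by parts, obtaining


$$\int_a^b f(x)(b - x)^n\,dx = -f(x)\,\frac{(b - x)^{n+1}}{n + 1}\bigg|_a^b + \int_a^b f'(x)\,\frac{(b - x)^{n+1}}{n + 1}\,dx$$
$$\qquad = f(a)\,\frac{(b - a)^{n+1}}{n + 1} + \frac{1}{n + 1}\int_a^b f'(x)(b - x)^{n+1}\,dx. \tag{3}$$


Suppose f(x) has continuous derivatives up to order n and a piecewise continuous
derivative $f^{(n+1)}(x)$ on [a, b]. Then, applying (3) repeatedly, we get


$$f(b) - f(a) = \int_a^b f'(x)\,dx = f'(a)(b - a) + \int_a^b f''(x)(b - x)\,dx$$

$$\qquad = f'(a)(b - a) + \frac{1}{2}f''(a)(b - a)^2 + \frac{1}{2}\int_a^b f'''(x)(b - x)^2\,dx = \cdots$$
$$\qquad = f'(a)(b - a) + \frac{1}{2}f''(a)(b - a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(b - a)^n + \frac{1}{n!}\int_a^b f^{(n+1)}(x)(b - x)^n\,dx,$$

which is equivalent to Taylor’s formula

$$f(b) = f(a) + f'(a)(b - a) + \frac{1}{2}f''(a)(b - a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(b - a)^n + R_n$$

with the remainder

$$R_n = \frac{1}{n!}\int_a^b f^{(n+1)}(x)(b - x)^n\,dx$$


in integral form.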


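b. Since obviously


$$\frac{1}{n!}\min_{a \le x \le b} f^{(n+1)}(x)\int_a^b (b - x)^n\,dx \le R_n \le \frac{1}{n!}\max_{a \le x \le b} f^{(n+1)}(x)\int_a^b (b - x)^n\,dx$$

and hence

$$\frac{(b - a)^{n+1}}{(n + 1)!}\min_{a \le x \le b} f^{(n+1)}(x) \le R_n \le \frac{(b - a)^{n+1}}{(n + 1)!}\max_{a \le x \le b} f^{(n+1)}(x),$$


we immediately get Lagrange’s form of the remainder


$$R_n = \frac{(b - a)^{n+1}}{(n + 1)!}\,f^{(n+1)}(c)$$


(Sec. 8.23), provided that the function $f^{(n+1)}(x)$ is continuous and hence
takes every value between its minimum and maximum values at some intermediate
point c (see Theorem 5.23).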


c. We now use these considerations to investigate the convergence of the Taylor 
series (Sec. 8.51) of the function 


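$$f(x) = (1 + x)^\alpha.$$


In this case, Taylor’s formula (with a = 0, b = x) takes the form


$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \cdots + \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!}x^n + R_n,$$
$$R_n = \frac{\alpha(\alpha - 1)\cdots(\alpha - n)}{n!}\int_0^x (x - t)^n(1 + t)^{\alpha - n - 1}\,dt.$$


If 0 ≤ t ≤ x < 1, then obviously


$$0 \le \frac{x - t}{1 + t} \le x,$$

and hence


$$\int_0^x \left(\frac{x - t}{1 + t}\right)^n (1 + t)^{\alpha - 1}\,dt \le x^n\int_0^x (1 + t)^{\alpha - 1}\,dt,$$


while if −1 < x ≤ t ≤ 0, then


$$\left|\int_0^x \left(\frac{x - t}{1 + t}\right)^n (1 + t)^{\alpha - 1}\,dt\right| \le \int_0^\xi \left(\frac{\xi - \tau}{1 - \tau}\right)^n (1 - \tau)^{\alpha - 1}\,d\tau \le \xi^n\int_0^\xi (1 - \tau)^{\alpha - 1}\,d\tau,$$


where ξ = −x, τ = −t. Therefore, if |x| < 1,


$$\left|\int_0^x (x - t)^n(1 + t)^{\alpha - n - 1}\,dt\right| \le |x|^n\left|\int_0^x (1 + t)^{\alpha - 1}\,dt\right| = C|x|^n,$$


where C = C(x, α). It follows that

$$|R_n| \le C\,\frac{|\alpha(\alpha - 1)\cdots(\alpha - n)|}{n!}\,|x|^n.$$


Denoting the quantity on the right by $\omega_n$, we get

$$\frac{\omega_{n+1}}{\omega_n} = \frac{|\alpha - n - 1|}{n + 1}\,|x|, \qquad \lim_{n \to \infty}\frac{\omega_{n+1}}{\omega_n} = |x|,$$


so that $\omega_n \to 0$ as n → ∞ if |x| < 1. Therefore the remainder $R_n$
approaches 0 as n → ∞ if |x| < 1, and hence


$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \cdots + \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!}x^n + \cdots \tag{4}$$

if −1 < x < 1. Unless α is a positive integer (in which case the right-hand side of
(4) reduces to a polynomial), the series (4) diverges if |x| > 1, by the same
argument as in the proof of D’Alembert’s test (Theorem 6.14a). Hence the radius
of convergence of the series (4) equals 1.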
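
The convergence of the series (4) for |x| < 1 can be observed numerically. In the sketch below, the function name `binomial_partial_sum` and the sample values α = 0.5, x = 0.6 are assumptions chosen for illustration; each term is obtained from the preceding one by multiplying by (α − n + 1)x/n.

```python
def binomial_partial_sum(alpha, x, n_terms):
    """Partial sum 1 + alpha*x + ... up to the x^n_terms term of series (4)."""
    term, total = 1.0, 1.0
    for n in range(1, n_terms + 1):
        term *= (alpha - n + 1) / n * x   # ratio of consecutive terms of (4)
        total += term
    return total

alpha, x = 0.5, 0.6
for n_terms in (5, 10, 20, 40, 80):
    print(n_terms, binomial_partial_sum(alpha, x, n_terms), (1 + x) ** alpha)
```

Choosing α = −1 and replacing x by −x in the same function reproduces the partial sums of the geometric series for 1/(1 − x).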

Replacing the real variable x in (4) by the complex variable z = x + iy, we get
the analytic continuation (Sec. 8.55) of the function $(1 + x)^\alpha$ into the
disk |z| < 1:

$$(1 + z)^\alpha = 1 + \alpha z + \frac{\alpha(\alpha - 1)}{2!}z^2 + \cdots + \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!}z^n + \cdots. \tag{5}$$


Choosing α = −1 and changing z to −z, we obtain the familiar formula for the
sum of a convergent geometric series


$$\frac{1}{1 - z} = 1 + z + z^2 + \cdots + z^n + \cdots$$


(see Example 6.11b).


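d. Writing (5) in the form


$$(1 + z)^\alpha = 1 + \frac{-\alpha}{1}(-z) + \frac{(-\alpha)(-\alpha + 1)}{1 \cdot 2}(-z)^2 + \cdots + \frac{(-\alpha)(-\alpha + 1)\cdots(-\alpha + n - 1)}{1 \cdot 2 \cdots n}(-z)^n + \cdots$$


and using the fact that the series


$$\sum_{n=1}^\infty \frac{\alpha(\alpha + 1)\cdots(\alpha + n - 1)}{\beta(\beta + 1)\cdots(\beta + n - 1)}\,z^n \qquad (\alpha, \beta \ne 0, -1, -2, \ldots)$$


converges for z = 1 if β > α + 1 and for z = −1 if β > α (see Example 6.67b), we
find that (5) converges for −z = 1 if 1 > −α + 1 and for −z = −1 if 1 > −α, i.e.,
for z = −1 if α > 0 and for z = 1 if α > −1. Then Abel’s theorem (Theorem 6.68)
implies the interesting formulas


$$1 - \alpha + \frac{\alpha(\alpha - 1)}{2!} - \frac{\alpha(\alpha - 1)(\alpha - 2)}{3!} + \cdots = 0 \qquad (\alpha > 0),$$

$$1 + \alpha + \frac{\alpha(\alpha - 1)}{2!} + \frac{\alpha(\alpha - 1)(\alpha - 2)}{3!} + \cdots = 2^\alpha \qquad (\alpha > -1).$$

For α = n = 1, 2, ..., these reduce to the familiar formulas


$$C_n^0 - C_n^1 + C_n^2 - \cdots + (-1)^n C_n^n = 0,$$
$$C_n^0 + C_n^1 + C_n^2 + \cdots + C_n^n = 2^n$$

satisfied by the binomial coefficients

$$C_n^k = \frac{n!}{k!\,(n - k)!}.$$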

9.53. Integration by substitution. Given an integral

$$\int_a^b f(u)\,du, \tag{6}$$


where f(u) is continuous on the interval a ≤ u ≤ b, let u = u(t) be a function with
a continuous derivative on some interval α ≤ t ≤ β, where u(α) = a, u(β) = b.
Then making a formal change of variables in (6), we get†


$$\int_a^b f(u)\,du = \int_\alpha^\beta f(u(t))\,u'(t)\,dt. \tag{7}$$


We now look for conditions guaranteeing the validity of (7). It might seem
natural to assume that u(t) is increasing on [α, β]. However, it turns out that (7)
is valid in the more general case where u(t) fails to be monotonic or even takes
values outside the interval [a, b] on which f(u) is originally defined, provided
only that f(u) has a continuous extension onto the (in general) larger interval†


$$[A, B] = \{u : u = u(t),\ \alpha \le t \le \beta\}. \tag{8}$$

Thus we have the following


THEOREM. Formula (7) holds (with a = u(α), b = u(β)) if u(t) has a continuous
derivative on the interval [α, β] and if f(u) is continuous on the interval (8).


Proof. Let F(u) be an antiderivative of the function f(u) on [A, B] and let G(t) be
an antiderivative of the function f(u(t))u'(t) on [α, β]. Then, by Theorem 9.38,


$$F(u(t)) = G(t) + C,$$

and hence, by Theorem 9.33,


$$\int_a^b f(u)\,du = F(b) - F(a) = F(u(\beta)) - F(u(\alpha)) = G(\beta) - G(\alpha) = \int_\alpha^\beta f(u(t))\,u'(t)\,dt.$$


It should be noted that in changing integrands in (7), we must also make a
corresponding change of limits of integration. Moreover, since Theorem 9.33 is
valid for the case a > b,† it does not matter which of the limits of integration in
either side of (7) is larger.


9.54. Applications of integration by substitution 


a. If f(x) is defined on [a, b], then the “shifted” function f(x − h) is defined on
the interval [a + h, b + h]. Making the substitution x − h = t, we get


$$\int_{a+h}^{b+h} f(x - h)\,dx = \int_a^b f(t)\,dt = \int_a^b f(x)\,dx.$$


Similarly, the “reflected” function f(a + b − x) is defined on the original interval
[a, b], and the substitution a + b − x = t shows that


$$\int_a^b f(a + b - x)\,dx = -\int_b^a f(t)\,dt = \int_a^b f(t)\,dt = \int_a^b f(x)\,dx.$$


b. THEOREM. The integral of a periodic function f(x) with period T has the same
value over every interval of length T.


Proof. Let m be the unique integer such that a ≤ mT < a + T. Then


$$\int_a^{a+T} f(x)\,dx = \int_a^{mT} f(x)\,dx + \int_{mT}^{a+T} f(x)\,dx.$$


Making the substitution x = t + (m − 1)T in the first integral on the right and the
substitution x = t + mT in the second integral, and using the periodicity to write


$$f(t + mT) = f(t + (m - 1)T) = f(t)$$

(cf. Sec. 8.65), we get


$$\int_a^{a+T} f(x)\,dx = \int_{a-(m-1)T}^{T} f(t)\,dt + \int_0^{a-(m-1)T} f(t)\,dt = \int_0^T f(t)\,dt = \int_0^T f(x)\,dx.$$
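
The theorem just proved can be illustrated numerically. The sketch below integrates a periodic function over several intervals of length T with different left end points a; the test function `f`, its period T = 2π, and the helper `midpoint` are assumptions introduced for the example.

```python
import math

def midpoint(f, a, b, steps=4000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h

T = 2 * math.pi

def f(x):
    # A hypothetical test function with period 2*pi; its integral over a period is pi.
    return math.sin(x) ** 2 + 0.3 * math.cos(3 * x)

values = [midpoint(f, a, a + T) for a in (-5.0, 0.0, 1.234, 10.0)]
print(values)
```

All four printed values coincide up to rounding error, independently of the choice of a.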
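c. Example. Consider the integral

$$\int_{-\pi}^{\pi} \sin^{2\alpha} kx\,dx \qquad (\alpha > 0,\ k = 1, 2, \ldots).$$

Using the formula


$$\sin kx = \cos\left(\frac{\pi}{2} - kx\right)$$


and the substitution


$$\frac{\pi}{2} - kx = ku,$$

we get

$$\int_{-\pi}^{\pi} \sin^{2\alpha} kx\,dx = \int_{-\pi}^{\pi} \cos^{2\alpha}\left(\frac{\pi}{2} - kx\right)dx = \int_{(\pi/2k)-\pi}^{(\pi/2k)+\pi} \cos^{2\alpha} ku\,du,$$

and hence, by the above theorem,

$$\int_{-\pi}^{\pi} \sin^{2\alpha} kx\,dx = \int_{-\pi}^{\pi} \cos^{2\alpha} kx\,dx. \tag{9}$$


For α = 1 it follows from (9) and the obvious formula


$$\int_{-\pi}^{\pi} (\sin^2 kx + \cos^2 kx)\,dx = \int_{-\pi}^{\pi} dx = 2\pi$$


that

$$\int_{-\pi}^{\pi} \sin^2 kx\,dx = \int_{-\pi}^{\pi} \cos^2 kx\,dx = \pi.$$

More generally, using formula (1), p. 309, we find that (9) equals


$$\frac{(2\alpha - 1)(2\alpha - 3)\cdots 3 \cdot 1}{2\alpha(2\alpha - 2)\cdots 4 \cdot 2}\,2\pi$$


if α = 1, 2, ... Clearly (9) vanishes if α = 1/2, α = 3/2, ... (why?).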
9.55. The Stieltjes integral 


a. Definition. We now introduce an important generalization of the concept of a
definite integral. Given two functions f(x) and g(x) defined on an interval [a, b],
let $\Pi = \{a = x_0 \le x_1 \le \cdots \le x_n = b\}$ be a partition of [a, b] with
marked points $\xi_k \in [x_k, x_{k+1}]$, and let


$$d(\Pi) = \max\{\Delta x_0, \Delta x_1, \ldots, \Delta x_{n-1}\} \qquad (\Delta x_k = x_{k+1} - x_k)$$


as before. Then, introducing the Riemann-Stieltjes sum

$$S_\Pi(f, g) = \sum_{k=0}^{n-1} f(\xi_k)[g(x_{k+1}) - g(x_k)], \tag{10}$$


we call the finite number I the Stieltjes integral of the function f(x) with respect
to the function g(x) over the interval [a, b] if, given any ε > 0, there exists a
δ > 0 such that


$$|I - S_\Pi(f, g)| < \varepsilon$$


for every partition Π with d(Π) < δ. In other words, the Stieltjes integral I,
denoted by


$$\int_a^b f(x)\,dg(x), \tag{11}$$


is the limit of the sum (10) under arbitrary refinement of the partition Π. Note
that

$$\int_a^b dg(x) = g(b) - g(a)$$


for every function g(x).


b. THEOREM. If f(x) is continuous on [a, b] and if g(x) has a continuous derivative
on [a, b], then the Stieltjes integral (11) exists and equals


$$\int_a^b f(x)g'(x)\,dx. \tag{12}$$


Proof. By Lagrange’s theorem (Sec. 7.44),

$$\Delta g(x_k) = g(x_{k+1}) - g(x_k) = g'(c_k)\,\Delta x_k,$$

where $c_k \in (x_k, x_{k+1})$. Therefore


$$\sum_{k=0}^{n-1} f(\xi_k)\,\Delta g(x_k) = \sum_{k=0}^{n-1} f(\xi_k)g'(c_k)\,\Delta x_k = \sum_{k=0}^{n-1} f(c_k)g'(c_k)\,\Delta x_k + \sum_{k=0}^{n-1} [f(\xi_k) - f(c_k)]g'(c_k)\,\Delta x_k$$
$$\qquad = \sum_{k=0}^{n-1} f(c_k)g'(c_k)\,\Delta x_k + \sum_{k=0}^{n-1} \varepsilon_k g'(c_k)\,\Delta x_k, \tag{13}$$


where, by the uniform continuity of f(x) on [a, b], the numbers $|\varepsilon_k|$
can all be made less than any given ε > 0 for a “sufficiently fine” partition Π.
Thus the second term in the right-hand side of (13) satisfies the inequality


$$\left|\sum_{k=0}^{n-1} \varepsilon_k g'(c_k)\,\Delta x_k\right| \le \varepsilon \max_{a \le x \le b} |g'(x)|\,(b - a),$$


and hence approaches zero under arbitrary refinement of Π, while the first term
is an ordinary Riemann sum of the function f(x)g'(x) and hence approaches the
limit (12) under arbitrary refinement of Π.
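
As an illustration of the theorem, the following Python sketch evaluates the Riemann-Stieltjes sum (10) for a fine partition and compares it with the value of (12). The function name `stieltjes_sum`, the choice of left end points as marked points, and the example f(x) = x, g(x) = x² on [0, 1] (for which (12) equals 2/3) are assumptions made for the example.

```python
def stieltjes_sum(f, g, a, b, n):
    """Riemann-Stieltjes sum (10): n equal subintervals, marked points at left ends."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (g(xs[k + 1]) - g(xs[k])) for k in range(n))

f = lambda x: x
g = lambda x: x * x            # g'(x) = 2x, so (12) is the integral of 2x^2 over [0, 1]

for n in (10, 100, 1000, 10_000):
    print(n, stieltjes_sum(f, g, 0.0, 1.0, n), 2.0 / 3.0)
```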


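c. The following formulas involving Stieltjes integrals hold whenever the
integrals on the right exist:


$$\int_a^b [\alpha_1 f_1(x) + \alpha_2 f_2(x)]\,dg(x) = \alpha_1\int_a^b f_1(x)\,dg(x) + \alpha_2\int_a^b f_2(x)\,dg(x),$$


$$\int_a^b f(x)\,d[\beta_1 g_1(x) + \beta_2 g_2(x)] = \beta_1\int_a^b f(x)\,dg_1(x) + \beta_2\int_a^b f(x)\,dg_2(x),$$


$$\int_a^b f(x)\,dg(x) = \int_a^c f(x)\,dg(x) + \int_c^b f(x)\,dg(x) \qquad (a < c < b),$$


$$\int_a^b f(x)\,dg(x) = f(x)g(x)\Big|_a^b - \int_a^b g(x)\,df(x)$$


($\alpha_1, \alpha_2, \beta_1, \beta_2$ real).‡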
9.6. More on Area 


9.61. Let f(x) be a continuous nonnegative function on the interval [a, b]. Then,
according to Sec. 9.21, the area under the curve y = f(x) from x = a to x = b
(more exactly, the area of the “curvilinear trapezoid” Φ bounded on the bottom
by the segment [a, b] of the x-axis, on top by the curve y = f(x), and on the sides
by segments of the vertical lines x = a and x = b) is given by the integral


$$I = \int_a^b f(x)\,dx \tag{1}$$


(see Figure 29). More generally, if Φ is a figure made up of several curvilinear
trapezoids with no common interior points, then the area of Φ is just the sum of
the areas of the separate trapezoids.†

Formula (1) gives the area if f(x) ≥ 0. If f(x) ≤ 0, as in Figure 30, then I ≤ 0
and I is the “area over the curve y = f(x) from x = a to x = b”‡ taken with the
minus sign. If f(x) changes sign in the interval [a, b], then I is the “area between
the curve y = f(x) and the x-axis,” more exactly, the sum of all the areas lying
over the x-axis and under the curve y = f(x) minus the sum of all the areas lying
under the x-axis and over the curve y = f(x) (see Figure 31).


9.62. Examples 
a. Let

$$f(x) = A + Bx + Cx^2 + Dx^3 \qquad (2)$$

be a polynomial of degree no higher than three. Then the area between the curve
y = f(x) and the x-axis (see Figure 32) is given by

$$\int_a^b f(x)\,dx = \frac{1}{6}(b - a)\left[f(a) + 4f\left(\frac{a + b}{2}\right) + f(b)\right] \qquad (3)$$

or equivalently

$$\int_{-c}^{c} f(x)\,dx = \frac{c}{3}\,[f(-c) + 4f(0) + f(c)] \qquad (2c = b - a) \qquad (3')$$


after shifting the origin to the midpoint of the interval [a, b]. To prove (3'), we
first note that if (3') holds for f(x) and g(x), then it also holds for kf(x) and f(x) +
g(x), where k is any real number. Hence to show that (2) satisfies (3'), we need
only show that the functions

$$f_0(x) = 1, \quad f_1(x) = x, \quad f_2(x) = x^2, \quad f_3(x) = x^3$$

all satisfy (3'). The validity of (3') is obvious for $f_0(x)$ and also for $f_1(x)$ and $f_3(x)$,
since both sides of (3') vanish if the integrand is odd. Moreover, (3') holds for
$f_2(x)$, since clearly

$$\int_{-c}^{c} x^2\,dx = \frac{2c^3}{3} = \frac{c}{3}\,[c^2 + 4 \cdot 0 + c^2]$$

(Sec. 9.35). It follows that (3'), or equivalently (3), holds for the function (2).
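
Formula (3) can be checked numerically on a sample cubic. The following is a minimal sketch, not part of the original text; the function names and the sample coefficients are chosen here purely for illustration.

```python
def three_point_rule(f, a, b):
    """Right-hand side of (3): (b - a)/6 * [f(a) + 4 f((a + b)/2) + f(b)]."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def cubic(A, B, C, D):
    """The polynomial (2): A + Bx + Cx^2 + Dx^3."""
    return lambda x: A + B * x + C * x**2 + D * x**3


def cubic_integral(A, B, C, D, a, b):
    """Left-hand side of (3), computed from the antiderivative of (2)."""
    def F(x):
        return A * x + B * x**2 / 2.0 + C * x**3 / 3.0 + D * x**4 / 4.0
    return F(b) - F(a)


# Illustrative coefficients and interval (assumptions, not from the text).
A, B, C, D = 1.0, -2.0, 0.5, 3.0
a, b = -1.0, 2.0

rule_value = three_point_rule(cubic(A, B, C, D), a, b)
exact_value = cubic_integral(A, B, C, D, a, b)
print(rule_value, exact_value)  # the two values agree up to rounding
```

For a polynomial of degree four or higher the two values generally differ, which is why the statement is restricted to degree no higher than three.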
b. Area of a circular disk. To find the area S of the circular disk
$x^2 + y^2 \le r^2$, we note that

$$S = 4\int_0^r y\,dx = 4\int_0^r \sqrt{r^2 - x^2}\,dx,$$

by symmetry (see Figure 33). Setting $x = r\cos\theta$, $dx = -r\sin\theta\,d\theta$, we get

$$S = -4r^2 \int_{\pi/2}^{0} \sin^2\theta\,d\theta = 4r^2 \int_0^{\pi/2} \sin^2\theta\,d\theta = 4r^2 \int_0^{\pi/2} \frac{1 - \cos 2\theta}{2}\,d\theta$$

$$= 2r^2 \int_0^{\pi/2} d\theta - 2r^2 \int_0^{\pi/2} \cos 2\theta\,d\theta = \pi r^2 - r^2 \sin 2\theta\,\Big|_0^{\pi/2} = \pi r^2.$$

Thus the number $\pi/2$, which we have defined as the first positive root of the
equation cos x = 0 (see Sec. 5.65b), can be interpreted as half the area of a
circular disk of radius 1.


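c. Area of a circular sector. Let $S(\theta_0, \theta_1)$ be the area of the sector of the disk
$x^2 + y^2 \le r^2$ bounded by the rays $\theta = \theta_0$ and $\theta = \theta_1$, where $0 \le \theta_0 \le \theta_1 \le \pi/2$ (see
Figure 34). Then obviously

$$S(\theta_0, \theta_1) = S(0, \theta_1) - S(0, \theta_0),$$

so that we need only calculate $S(0, \theta_0)$ for arbitrary $\theta_0 \in [0, \pi/2]$. It is clear from
Figure 35 that

$$S(0, \theta_0) = \int_0^{r\cos\theta_0} x \tan\theta_0\,dx + \int_{r\cos\theta_0}^{r} \sqrt{r^2 - x^2}\,dx$$

$$= \frac{x^2}{2}\Big|_0^{r\cos\theta_0} \tan\theta_0 + r^2 \int_0^{\theta_0} \sin^2\theta\,d\theta$$

$$= \frac{r^2}{2}\cos^2\theta_0 \tan\theta_0 + r^2 \int_0^{\theta_0} \frac{1 - \cos 2\theta}{2}\,d\theta$$

$$= \frac{r^2}{2}\sin\theta_0\cos\theta_0 + \frac{r^2}{2}\theta_0 - \frac{r^2}{4}\sin 2\theta\,\Big|_0^{\theta_0}$$

$$= \frac{r^2}{2}\sin\theta_0\cos\theta_0 + \frac{r^2}{2}\theta_0 - \frac{r^2}{2}\sin\theta_0\cos\theta_0 = \frac{r^2}{2}\theta_0,$$

and hence

$$S(\theta_0, \theta_1) = \frac{r^2}{2}(\theta_1 - \theta_0). \qquad (4)$$

It is easy to see that (4) is actually valid for any circular sector of central angle
$0 \le \theta_1 - \theta_0 \le 2\pi$.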

d. Suppose we subject a curvilinear trapezoid $\Phi$ to a k-fold expansion along the
y-axis, as shown in Figure 36. Then the area $S_k$ of the resulting figure is k times
the area S of the original figure, since obviously

$$S_k = \int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx = kS \qquad (5)$$

(cf. Sec. B.6, p. 486). The same is true if we subject $\Phi$ to a k-fold expansion
along the x-axis, as shown in Figure 37, since then

$$S_k = \int_{ka}^{kb} f\left(\frac{x}{k}\right)dx = \int_a^b f(t)\,k\,dt = k\int_a^b f(t)\,dt = kS \qquad (5')$$

(make the substitution x = kt).


e. Area enclosed by an ellipse. The ellipse

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$

is obtained from the circle $x^2 + y^2 = 1$ by an a-fold expansion along the x-axis
and a b-fold expansion along the y-axis (see Figure 38). Since the area enclosed
by the circle equals $\pi$, it follows from (5) and (5') that the area enclosed by the
ellipse equals $\pi ab$.


Figure38 


Figure39 


f. Geometric meaning of the inverse hyperbolic cosine. Consider the problem
of finding the area of the sector ABC shown in Figure 39, where C lies on the
right-hand branch of the hyperbola $x^2 - y^2 = 1$. This area is given by the integral

$$I = \int_1^x \sqrt{u^2 - 1}\,du,$$

which we evaluate by making the substitution u = cosh t (cf. Sec. 9.44f):

$$I = \int_1^x \sqrt{u^2 - 1}\,du = \int_0^{\operatorname{arc\,cosh} x} \sinh^2 t\,dt.$$

But

$$\sinh^2 t = \frac{\cosh 2t - 1}{2},$$

since

$$\cosh 2t = \cosh^2 t + \sinh^2 t, \qquad 1 = \cosh^2 t - \sinh^2 t$$

(cf. Sec. 8.74), and hence

$$I = \int_0^{\operatorname{arc\,cosh} x} \frac{\cosh 2t - 1}{2}\,dt = \left(\frac{\sinh 2t}{4} - \frac{t}{2}\right)\Big|_0^{\operatorname{arc\,cosh} x}.$$

Moreover,

$$\sinh 2t = 2\sqrt{\cosh^2 t - 1}\,\cosh t,$$

since

$$\sinh 2t = 2\sinh t\cosh t,$$

and hence

$$I = \frac{1}{2}\left(x\sqrt{x^2 - 1} - \operatorname{arc\,cosh} x\right) = \frac{1}{2}x\sqrt{x^2 - 1} - \frac{1}{2}\operatorname{arc\,cosh} x.$$

The first term is the area of the triangle OBC. Therefore $\frac{1}{2}$ arc cosh x is the area
of the sector OAC, so that

$$S = \operatorname{arc\,cosh} x \qquad (6)$$


is the area of the sector OC'AC shown in Figure 40. Inverting (6), we get the
parametric representation of the hyperbola

$$x = \cosh S, \qquad y = \sqrt{x^2 - 1} = \sinh S. \qquad (7)$$

Naturally, it is the interpretation of the parameter S as the area of the sector
OC'AC determined by the variable point C that is at issue here, since the mere
fact that (7) is a parametric representation of the hyperbola $x^2 - y^2 = 1$ is
immediately obvious.


It is interesting to compare (7) with the parametric representation

$$x = \cos\theta, \qquad y = \sin\theta$$

of the circle $x^2 + y^2 = 1$. Here $\theta$ is the central angle AOC shown in Figure 41, or
alternatively, because of (4), the area of the sector OC'AC. Thus the area of a
variable sector can be chosen as the basic parameter both for the circle and for
the hyperbola.
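
The area interpretation of arc cosh x can be checked numerically. The sketch below is an illustration only: the midpoint-rule helper, the variable names, and the sample abscissa x = 2.5 are assumptions introduced here. It compares the integral of $\sqrt{u^2 - 1}$ with its closed form and then recovers twice the area of the sector OAC.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


x = 2.5  # sample abscissa of the point C on the hyperbola x^2 - y^2 = 1

# Area of the curvilinear figure ABC under the hyperbola from u = 1 to u = x.
area_abc = midpoint_integral(lambda u: math.sqrt(u * u - 1.0), 1.0, x)

# Closed form I = (1/2) (x sqrt(x^2 - 1) - arc cosh x).
closed_form = 0.5 * (x * math.sqrt(x * x - 1.0) - math.acosh(x))

# Triangle OBC minus the figure ABC is the sector OAC; doubling it gives OC'AC.
triangle_obc = 0.5 * x * math.sqrt(x * x - 1.0)
sector_oc_ac = 2.0 * (triangle_obc - area_abc)

print(area_abc, closed_form)
print(sector_oc_ac, math.acosh(x))
```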


g. Young’s inequality. Let

$$y = f(x) \qquad (0 \le x < \infty)$$

be an increasing continuous function such that $f(0) = 0$, $f(\infty) = \infty$, with inverse
function

$$x = g(y) \qquad (0 \le y < \infty),$$

which is itself increasing and continuous (see Theorem 5.35b), and let

$$F(x) = \int_0^x f(\xi)\,d\xi, \qquad G(y) = \int_0^y g(\eta)\,d\eta \qquad (x \ge 0,\ y \ge 0).$$

Then it is geometrically clear from Figure 42 that

$$xy \le F(x) + G(y) \qquad (8)$$

for all $x \ge 0$, $y \ge 0$, a result known as Young’s inequality. Note that the
inequality becomes an equality if and only if y = f(x).

Young’s inequality is the starting point for a number of other important
inequalities. For example, (8) implies

$$xy \le \frac{x^p}{p} + \frac{y^q}{q} \qquad \left(p > 1,\ \frac{1}{p} + \frac{1}{q} = 1\right)$$

if we choose

$$y = x^{p-1}, \qquad x = y^{q-1},$$

and

$$xy \le (x + 1)\ln(x + 1) - x + e^y - y - 1$$

if we choose

$$y = \ln(x + 1), \qquad x = e^y - 1,$$

where both inequalities hold for all $x \ge 0$, $y \ge 0$.
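
Both special cases of (8) can be spot-checked on a grid of sample points. This is only an illustrative sketch: the helper names, the exponent p = 3, and the grid are assumptions made here, and a finite grid check is of course no substitute for the geometric argument.

```python
import math


def gap_power(x, y, p):
    """x^p/p + y^q/q - xy, with q defined by 1/p + 1/q = 1 (p > 1)."""
    q = p / (p - 1.0)
    return x**p / p + y**q / q - x * y


def gap_log(x, y):
    """(x + 1) ln(x + 1) - x + e^y - y - 1 - xy."""
    return (x + 1.0) * math.log(x + 1.0) - x + math.exp(y) - y - 1.0 - x * y


grid = [0.1 * i for i in range(51)]  # sample points in [0, 5]

# Young's inequality says both gaps are nonnegative everywhere.
min_gap_power = min(gap_power(x, y, 3.0) for x in grid for y in grid)
min_gap_log = min(gap_log(x, y) for x in grid for y in grid)

# Equality case y = f(x): the gap should vanish (up to rounding).
x0 = 1.7
equality_power = gap_power(x0, x0**2, 3.0)          # y = x^(p-1) with p = 3
equality_log = gap_log(x0, math.log(x0 + 1.0))      # y = ln(x + 1)

print(min_gap_power, min_gap_log, equality_power, equality_log)
```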

9.63. Area in polar coordinates. Given a curve with equation

$$r = f(\theta) \ge 0 \qquad (\alpha \le \theta \le \beta)$$

in polar coordinates, we now examine the problem of calculating the area S of
the “curvilinear sector” OAB bounded by the two rays $\theta = \alpha$, $\theta = \beta$ and the arc of
the curve between the rays (see Figure 43). Our approach is the exact analogue
of the treatment of the area of a “curvilinear trapezoid” given in Sec. 9.21.
Consider a partition

$$\Pi = \{\alpha = \theta_0 \le \xi_0 \le \theta_1 \le \xi_1 \le \theta_2 \le \dots \le \theta_{n-1} \le \xi_{n-1} \le \theta_n = \beta\},$$

where the point $\xi_k \in [\theta_k, \theta_{k+1}]$ is chosen so that

$$f(\xi_k) = M_k = \max_{\theta_k \le \theta \le \theta_{k+1}} f(\theta).$$

Then the corresponding Riemann sum of the function

$$g(\theta) = \frac{1}{2} f^2(\theta),$$

equal to

$$\overline{S}_\Pi(g) = \frac{1}{2} \sum_{k=0}^{n-1} M_k^2\, \Delta\theta_k$$

($\Delta\theta_k = \theta_{k+1} - \theta_k$), can be interpreted as the area of an “elementary figure”
containing OAB, made up of n “circumscribed circular sectors” like the sector of
radius $M_k$ shown in Figure 44. The use of $g(\theta)$ instead of the original function
$f(\theta)$ is due, of course, to the fact that the area of a circular sector of radius r and
central angle $\Delta\theta$ is given by $\frac{1}{2}r^2 \Delta\theta$ and not by $r\Delta\theta$, as shown by formula (4). On
the other hand, if we choose $\xi_k$ so that

$$f(\xi_k) = m_k = \min_{\theta_k \le \theta \le \theta_{k+1}} f(\theta),$$

then the corresponding Riemann sum of $g(\theta)$, equal to

$$\underline{S}_\Pi(g) = \frac{1}{2} \sum_{k=0}^{n-1} m_k^2\, \Delta\theta_k,$$

can be interpreted as the area of an “elementary figure” contained in OAB, made
up of n “inscribed circular sectors,” like the sector of radius $m_k$ shown in Figure
44. Therefore any number S which can lay claim to be a value of the “area” of
the figure OAB must satisfy the condition

$$\underline{S}_\Pi(g) \le S \le \overline{S}_\Pi(g).$$


But $g(\theta)$ is integrable, being continuous, so that $\underline{S}_\Pi(g)$ and $\overline{S}_\Pi(g)$ approach the
same limit, equal to the integral of $g(\theta)$ over $[\alpha, \beta]$, under arbitrary refinement of
the partition $\Pi$. It follows that

$$S = \frac{1}{2} \int_\alpha^\beta f^2(\theta)\,d\theta \qquad (9)$$

is the only possible value of S. The existence of S is established in the same way
as for the case of rectangular coordinates.
As an example of the use of formula (9), consider the curve

$$r = a\sin n\theta \ge 0 \qquad (n \text{ odd}), \qquad (10)$$

called an n-leaved rose (shown in Figure 45 for the case n = 3). Since

$$\frac{1}{2}\int_0^{\pi/n} a^2 \sin^2 n\theta\,d\theta = \frac{1}{2n}\int_0^{\pi} a^2 \sin^2\varphi\,d\varphi$$

(make the substitution $n\theta = \varphi$), the area enclosed by one leaf of an n-leaved rose
is 1/n times the area enclosed by the “single-leaved rose”

$$r = a\sin\theta. \qquad (11)$$

But (11) is simply an ordinary circle of radius a/2. It follows that the area
enclosed by all n leaves of the curve (10) is precisely the area $\pi a^2/4$ enclosed by
the circle (11).
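
The claim about the rose can be tested by evaluating formula (9) numerically. The sketch below is an illustration under assumed values (a = 2, three leaves) and uses a hypothetical helper `polar_area` built on a midpoint Riemann sum of $\frac{1}{2}f^2(\theta)$.

```python
import math


def polar_area(f, alpha, beta, steps=20_000):
    """Midpoint Riemann sum for formula (9): (1/2) * integral of f(theta)^2."""
    h = (beta - alpha) / steps
    return 0.5 * h * sum(f(alpha + (i + 0.5) * h) ** 2 for i in range(steps))


a = 2.0          # assumed scale factor of the rose
n_leaves = 3     # the case shown in Figure 45

# One leaf corresponds to 0 <= theta <= pi/n, where r = a sin(n theta) >= 0.
one_leaf = polar_area(lambda th: a * math.sin(n_leaves * th),
                      0.0, math.pi / n_leaves)
all_leaves = n_leaves * one_leaf

# Area of the circle r = a sin(theta), which has radius a/2.
circle_area = math.pi * a * a / 4.0

print(all_leaves, circle_area)
```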


9.7. More on Arc Length 


9.71. Length of a parametric curve. It will be recalled from Sec. 9.22 that the
length of the curve

$$y = f(x) \qquad (a \le x \le b),$$

defined as the limiting length of a polygonal line inscribed in the curve as the
length of the segments making up the line approaches zero, is given by the
integral

$$s = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx \qquad (1)$$

(provided the integral exists). In the general case, a curve L is described by
parametric equations, specifying the coordinates of a variable point of L as
functions of some parameter t. There are two such parametric equations

$$x = x(t), \quad y = y(t) \qquad (a \le t \le b)$$

in the plane, three equations

$$x = x(t), \quad y = y(t), \quad z = z(t) \qquad (a \le t \le b)$$

in three-dimensional space, and more generally, n equations

$$x_1 = x_1(t), \quad x_2 = x_2(t), \quad \dots, \quad x_n = x_n(t) \qquad (a \le t \le b) \qquad (2)$$

in n-dimensional space. We now derive a formula for the length of such a
“parametric curve.”


THEOREM. Let L be the curve with parametric equations (2), where every function
$x_j(t)$ is piecewise smooth on [a, b]. Then the length of L, defined as the limiting
length of a polygonal line “inscribed” in L, is given by the formula

$$s = \int_a^b \sqrt{\sum_{j=1}^{n} [x_j'(t)]^2}\,dt. \qquad (3)$$
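Proof. Let

$$\Pi = \{a = t_0 \le t_1 \le \dots \le t_p = b\}$$

be a partition of [a, b], where each parameter value $t_k$ leads to a point $P_k$ on the
curve L, with coordinates $x_1(t_k), x_2(t_k), \dots, x_n(t_k)$. Joining consecutive points $P_k$
with “n-dimensional line segments,” we get a polygonal line $L_\Pi = P_0P_1 \dots P_p$,
“inscribed in L,” of length

$$s(L_\Pi) = \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} [x_j(t_{k+1}) - x_j(t_k)]^2}.$$

Since the functions $x_j(t)$ are all piecewise smooth, we have

$$x_j(t_{k+1}) - x_j(t_k) = \int_{t_k}^{t_{k+1}} x_j'(t)\,dt = \int_{t_k}^{t_{k+1}} x_j'(t_k)\,dt + \int_{t_k}^{t_{k+1}} [x_j'(t) - x_j'(t_k)]\,dt,$$

where the first integral on the right equals $x_j'(t_k)\,\Delta t_k$ with $\Delta t_k = t_{k+1} - t_k$, and we
interpret $x_j'(t_k)$ as $x_j'(t_k + 0)$ if $t_k$ is a discontinuity point of $x_j'(t)$. Therefore

$$|x_j(t_{k+1}) - x_j(t_k) - x_j'(t_k)\,\Delta t_k| \le \varepsilon_{jk}\,\Delta t_k,$$

where

$$\varepsilon_{jk} = \max_{t_k \le t \le t_{k+1}} |x_j'(t) - x_j'(t_k)|.$$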

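But then

$$\left| \sqrt{\sum_{j=1}^{n} [x_j(t_{k+1}) - x_j(t_k)]^2} - \sqrt{\sum_{j=1}^{n} [x_j'(t_k)]^2}\;\Delta t_k \right| \le \sqrt{\sum_{j=1}^{n} \varepsilon_{jk}^2}\;\Delta t_k$$

(see Sec. 3.14b), which implies

$$\left| \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} [x_j(t_{k+1}) - x_j(t_k)]^2} - \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} [x_j'(t_k)]^2}\;\Delta t_k \right| \le \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} \varepsilon_{jk}^2}\;\Delta t_k,$$

or equivalently,

$$\left| s(L_\Pi) - \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} [x_j'(t_k)]^2}\;\Delta t_k \right| \le \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} \varepsilon_{jk}^2}\;\Delta t_k.$$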


We can write the sum on the right in the form

$$\sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} \varepsilon_{jk}^2}\;\Delta t_k = {\sum}' \sqrt{\sum_{j=1}^{n} \varepsilon_{jk}^2}\;\Delta t_k + {\sum}'' \sqrt{\sum_{j=1}^{n} \varepsilon_{jk}^2}\;\Delta t_k,$$

where the first sum consists of the terms stemming from intervals on which all
the $x_j'(t)$ are continuous and the second sum consists of the terms stemming from
intervals containing a discontinuity point of at least one $x_j'(t)$. Since every $x_j'(t)$ is
uniformly continuous on all the intervals $[t_k, t_{k+1}]$ involved in the first sum, we
have

$$\varepsilon_{jk} \le \omega(d(\Pi)) \qquad (j = 1, 2, \dots, n)$$

for all such intervals, where $\omega(u)$ approaches zero as $u \to 0$ and $d(\Pi)$ has the
usual meaning, i.e.,

$$d(\Pi) = \max\{\Delta t_0, \Delta t_1, \dots, \Delta t_{p-1}\}.$$


It follows that

$$\left| s(L_\Pi) - \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} [x_j'(t_k)]^2}\;\Delta t_k \right| \le \sqrt{n}\,(b - a)\,\omega(d(\Pi)) + 2M\sqrt{n}\,{\sum}''\,\Delta t_k, \qquad (4)$$

where

$$M = \sup_{\substack{a \le t \le b \\ 1 \le j \le n}} |x_j'(t)|$$

and ${\sum}''\,\Delta t_k$ is the sum of the lengths of all the intervals $[t_k, t_{k+1}]$ containing
discontinuity points of at least one of the $x_j'(t)$. There are a fixed number of such
points, and hence ${\sum}''\,\Delta t_k \to 0$ as $d(\Pi) \to 0$. Hence, passing to the limit as $d(\Pi) \to 0$
in (4), we find that

$$\lim_{d(\Pi) \to 0} s(L_\Pi) = \lim_{d(\Pi) \to 0} \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^{n} [x_j'(t_k)]^2}\;\Delta t_k.$$

But this proves (3), since the quantity on the left is just the length s of the curve
L, defined as the limiting length of the inscribed polygonal line $L_\Pi$ as $d(\Pi) \to 0$,
while the quantity on the right, involving Riemann sums of the function

$$\sqrt{\sum_{j=1}^{n} [x_j'(t)]^2},$$

is just the integral

$$\int_a^b \sqrt{\sum_{j=1}^{n} [x_j'(t)]^2}\,dt$$

(which exists since the integrand is piecewise continuous, like the functions $x_j'(t)$).


If n = 2, formula (3) reduces to

$$s = \int_a^b \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt \qquad (3')$$

($x_1 = x$, $x_2 = y$). Choosing x as the parameter, we get formula (1).
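
The definition of length as a limit of inscribed polygonal lines can be watched at work numerically. The following sketch is an illustration only: the helix is a sample curve chosen here (not taken from the text), and the helper names are hypothetical. For the helix x = cos t, y = sin t, z = t on $[0, 2\pi]$, formula (3) gives $s = 2\pi\sqrt{2}$.

```python
import math


def polygon_length(curve, a, b, p):
    """Length of the polygonal line through curve(t_k), t_k = a + k (b - a)/p."""
    points = [curve(a + (b - a) * k / p) for k in range(p + 1)]
    return sum(math.dist(points[k], points[k + 1]) for k in range(p))


def helix(t):
    """Sample curve in three-dimensional space (an assumption for this demo)."""
    return (math.cos(t), math.sin(t), t)


# Formula (3): the integrand sqrt(sin^2 t + cos^2 t + 1) is the constant sqrt(2).
exact_length = 2.0 * math.pi * math.sqrt(2.0)

# Each partition refines the previous one, so the lengths can only increase.
polygon_lengths = [polygon_length(helix, 0.0, 2.0 * math.pi, p)
                   for p in (4, 16, 64, 256)]

for p, length in zip((4, 16, 64, 256), polygon_lengths):
    print(p, length, exact_length - length)
```

The inscribed lengths stay below the value of the integral and approach it as the partition is refined, in line with the theorem and with the inequality of Sec. 9.73.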


9.72. Examples 


a. It follows from (1) that the length of the catenary y = cosh x between the
points with abscissas x = 0 and x = b is just

$$s = \int_0^b \sqrt{1 + \sinh^2 x}\,dx = \int_0^b \cosh x\,dx = \sinh b$$

(see Figure 46, noting that the length of the curve AB is the same as that of the
segment CD).

b. Using (3'), we find that the length of the arc of the circle

$$x = r\cos\theta, \qquad y = r\sin\theta,$$

bounded by the rays $\theta = \theta_1$ and $\theta = \theta_2$ ($0 \le \theta_1 < \theta_2 \le 2\pi$) equals

$$s = \int_{\theta_1}^{\theta_2} \sqrt{r^2\sin^2\theta + r^2\cos^2\theta}\,d\theta = r(\theta_2 - \theta_1).$$

In particular, choosing $\theta_1 = 0$, $\theta_2 = 2\pi$, we find that the circumference (length) of
the whole circle is just $2\pi r$. This justifies the familiar interpretation of the
number $\pi$ as the ratio of the circumference of a circle to its diameter.


c. The length of the arc

$$x = a\cos t, \qquad y = b\sin t \qquad (0 \le t \le T \le 2\pi,\ a < b)$$

of an ellipse of eccentricity

$$\varepsilon = \frac{\sqrt{b^2 - a^2}}{b}$$

is given by

$$s = \int_0^T \sqrt{a^2\sin^2 t + b^2\cos^2 t}\,dt = \int_0^T \sqrt{a^2\sin^2 t + b^2(1 - \sin^2 t)}\,dt$$

$$= \int_0^T \sqrt{b^2 - (b^2 - a^2)\sin^2 t}\,dt = b\int_0^T \sqrt{1 - \varepsilon^2\sin^2 t}\,dt = bE(\varepsilon, T),$$

in terms of the elliptic integral $E(\varepsilon, T)$ introduced in Sec. 9.45.

d. The length of a “quarter cycle” of the sinusoid y = sin x is also given by an
elliptic integral, since

$$s = \int_0^{\pi/2} \sqrt{1 + \cos^2 x}\,dx = \int_0^{\pi/2} \sqrt{2 - \sin^2 x}\,dx = \sqrt{2}\int_0^{\pi/2} \sqrt{1 - \tfrac{1}{2}\sin^2 x}\,dx = \sqrt{2}\,E\left(\frac{1}{\sqrt{2}}, \frac{\pi}{2}\right).$$

This is hardly surprising, since the ellipse obtained by cutting an oblique cross
section of a circular cylinder becomes a sinusoid when the cylinder is “unrolled”
onto a plane.


e. Unlike the case of area, the length of a curve does not in general transform in 
a simple way under an expansion along one of the coordinate axes. In fact, even 


in the case of a polygonal line, each segment of the line is multiplied by its own 
coefficient under such an expansion, where the coefficient depends on the angle 
between the segment and the direction of the axis in question. However, if we 
carry out a k-fold expansion along all the coordinate axes simultaneously, 
thereby making a similarity transformation (see Sec. 2.67), then obviously the 
length of each segment of every polygonal line is multiplied by k, and hence so 
is the length of every piecewise smooth curve. 


9.73. Arc length versus chord length. Let $s_0$ denote the length of the chord $L_0$
joining the end points of a piecewise smooth curve L, and let $L_\Pi = P_0P_1 \dots P_p$
be any polygonal line inscribed in L, as in the proof of Theorem 9.71. Then, by
the triangle inequality (cf. Theorem 3.11a),

$$s(L_\Pi) \ge s_0,$$

where $s(L_\Pi)$ is the length of $L_\Pi$ and $s_0$ is the length of $L_0$. Taking the limit as
$d(\Pi) \to 0$, we find that

$$s \ge s_0,$$

or equivalently

$$\frac{s}{s_0} \ge 1,$$

where s is the length of the curve L.


THEOREM. If s is the length of the curve L with parametric equations (2) and if $s_0$ is
the length of the chord joining the end points of L, then

$$\lim_{b \to a} \frac{s}{s_0} = 1,$$

provided that

$$\sum_{j=1}^{n} [x_j'(a)]^2 \ne 0. \qquad (5)$$


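Proof. Since b will eventually be made to approach a, there is no loss of
generality in assuming that every $x_j'(t)$ is continuous on [a, b]. Using (3), we
have

$$\frac{s}{s_0} = \frac{\displaystyle\int_a^b \sqrt{\sum_{j=1}^{n} [x_j'(t)]^2}\,dt}{\sqrt{\displaystyle\sum_{j=1}^{n} [x_j(b) - x_j(a)]^2}} = \frac{(b - a)\sqrt{\displaystyle\sum_{j=1}^{n} [x_j'(\tau)]^2}}{(b - a)\sqrt{\displaystyle\sum_{j=1}^{n} [x_j'(\tau_j)]^2}} \qquad (6)$$

for suitable $\tau, \tau_1, \dots, \tau_n$ all belonging to the interval (a, b). This follows from
Theorem 9.15f in the case of the numerator and Lagrange’s theorem (Sec. 7.44)
in the case of the denominator. But the right-hand side of (6) clearly approaches
1 as $b \to a$, provided that (5) holds.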


9.74. Arc length as parameter. Again let L be a piecewise smooth curve with
parametric equations (2), and let s(t) be the length of the arc of L joining the
initial point of L (with parameter value a) to the variable point

$$P(t) = (x_1(t), x_2(t), \dots, x_n(t)) \qquad (a \le t \le b).$$

Then

$$s(t) = \int_a^t \sqrt{\sum_{j=1}^{n} [x_j'(\tau)]^2}\,d\tau, \qquad (7)$$

where s(t) is itself piecewise smooth, with derivative

$$s'(t) = \sqrt{\sum_{j=1}^{n} [x_j'(t)]^2}. \qquad (8)$$

Clearly (8) is nonnegative and continuous everywhere except at the discontinuity
points of the $x_j'(t)$. It follows that s(t) is nondecreasing on [a, b].

Now suppose the functions $x_1'(t), \dots, x_n'(t)$ do not vanish simultaneously on
[a, b], so that s(t) is increasing. Then s = s(t) has a continuous increasing
inverse $t = \varphi(s)$ on the interval [0, l], where l is the length of L (see Theorem
5.35b). Moreover, let $[\alpha, \beta] \subset [a, b]$ be an interval on which the functions $x_j'(t)$
are all continuous. Then, by Theorem 7.16, $t = \varphi(s)$ has a derivative on $[s(\alpha), s(\beta)]$
equal to

$$\varphi'(s) = \frac{1}{s'(t)} = \frac{1}{\sqrt{\displaystyle\sum_{j=1}^{n} [x_j'(t)]^2}}.$$

Hence $\varphi(s)$ is piecewise smooth, like s(t) itself (why?). But then every function
$x_j(t)$ can be expressed as a continuous function of s, with a piecewise continuous
derivative. In particular, if the arc length s is chosen as the original parameter t,
it follows from (8) that

$$\sum_{j=1}^{n} [x_j'(s)]^2 = 1.$$

Points at which all the functions $x_j'(t)$ vanish simultaneously are called
singular points of the curve L. Thus we see that a parametric representation of L
with the arc length as parameter is possible on parts of the curve which contain
no singular points.


9.8. Area of a Surface of Revolution 


By a surface of revolution we mean the surface $\Omega$ generated by rotating a curve
L about a straight line lying in its plane. For example, rotating the curve with
equation

$$y = f(x) \ge 0 \qquad (a \le x \le b) \qquad (1)$$

about the x-axis generates the surface of revolution $\Omega$ shown in Figure 47. More
generally, rotating the curve with parametric equations

$$x = x(t), \quad y = y(t) \ge 0 \qquad (a \le t \le b) \qquad (2)$$

about the x-axis also generates a surface of revolution $\Omega$. We now consider the
problem of defining and calculating the area of $\Omega$.


9.81. Area of a cone. We begin with the case where the curve (1) is a straight 
line with equation 


ee (3) 


Rotating the line (3) about the x-axis then generates the surface of revolution $\Omega$
shown in Figure 48, i.e., a conical surface (or simply a cone). The line (3) is
called the generator of the cone, the point A in which the generator intersects the
x-axis is called the vertex of the cone, and the circle $\Gamma$ obtained by rotating the
point B about the x-axis is called the directrix of the cone.

To define the “area” of the cone $\Omega$ we argue as follows. Choose consecutive
points $B_1, \dots, B_n$ on the circle $\Gamma$, and then join each point to the next (and the
point $B_n$ to the point $B_1$), thereby obtaining a polygonal line $\Lambda = B_1B_2 \dots B_nB_1$
and a set of isosceles triangles $B_1AB_2, \dots, B_nAB_1$ which together make up a
pyramidal surface $\Phi_\Lambda$ “inscribed” in $\Omega$, as shown in Figure 49. The area of $\Phi_\Lambda$ is
defined in the obvious way as the sum $S(\Phi_\Lambda)$ of the areas of the triangles
$B_1AB_2, \dots, B_nAB_1$. Let the length of the segment $B_kB_{k+1}$ be $2\delta_k$ (k = 1, ..., n), and
let

$$d(\Lambda) = \max\{\delta_1, \dots, \delta_n\}.$$

Defining the area of the cone $\Omega$ as the limit

$$S = \lim_{d(\Lambda) \to 0} S(\Phi_\Lambda), \qquad (4)$$

we now derive a formula for S (this incidentally proves the existence and
uniqueness of S):

THEOREM. The limit (4) equals $\pi lr$, where l is the “slant height” of the cone and r
the radius of its base.

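Proof. The altitude of the triangle $B_kAB_{k+1}$ equals

$$\sqrt{l^2 - \delta_k^2} = l\sqrt{1 - \frac{\delta_k^2}{l^2}} = l\left(1 - \frac{\delta_k^2 E_k}{2l^2}\right) = l - \frac{\delta_k^2 E_k}{2l},$$

where $E_k \to 1$ as $\delta_k \to 0$ (see Theorem 5.59e). Hence the area of $B_kAB_{k+1}$ equals

$$\delta_k\sqrt{l^2 - \delta_k^2} = \delta_k l - \frac{\delta_k^3 E_k}{2l},$$

and the area of the inscribed pyramidal surface $\Phi_\Lambda$ is just

$$S(\Phi_\Lambda) = \sum_{k=1}^{n} \left(\delta_k l - \frac{\delta_k^3 E_k}{2l}\right) = \frac{1}{2}\,l\,s(\Lambda) - \frac{1}{2l}\sum_{k=1}^{n} \delta_k^3 E_k, \qquad (5)$$

where $s(\Lambda)$ is the length of the polygonal line $\Lambda = B_1B_2 \dots B_nB_1$. As $d(\Lambda) \to 0$, $s(\Lambda)$
approaches the circumference $2\pi r$ of the circle $\Gamma$ (Example 9.72b), and hence
$\frac{1}{2}\,l\,s(\Lambda) \to \pi lr$. Moreover, if the $\delta_k$ are all less than $\varepsilon$, then

$$\frac{1}{2l}\sum_{k=1}^{n} \delta_k^3 E_k \le \frac{1}{2l}\,2\varepsilon^2 s(\Lambda) \qquad (6)$$

(for sufficiently small $\delta_k$), so that (6) approaches zero as $d(\Lambda) \to 0$. It follows that
(5) approaches $\pi lr$ as $d(\Lambda) \to 0$.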


9.82. Area of a conical band. Let $\Omega$ be the “conical band” shown in Figure 50,
generated by rotating a segment of the line (3) about the x-axis. Then we define
the area of $\Omega$ in the natural way, i.e., as the difference between the area of the
“larger cone,” of slant height $l_1$ and base radius $r_1$, and the area of the “smaller
cone,” of slant height $l_2$ and base radius $r_2$. By the above theorem,

$$S = \pi(l_1r_1 - l_2r_2).$$

But, by similarity,

$$\frac{l_1}{r_1} = \frac{l_2}{r_2},$$

or equivalently,

$$l_1r_2 = l_2r_1,$$

and hence

$$S = \pi(l_1 - l_2)(r_1 + r_2),$$

or finally

$$S = \pi l(r_1 + r_2), \qquad (7)$$

where $l = l_1 - l_2$ is just the “slant height” of the conical band $\Omega$.


9.83. Area of a general surface of revolution. Let $\Omega$ be the surface of
revolution generated by rotating the curve L with parametric equations (2) about
the x-axis, and let $L_\Pi$ be the polygonal line inscribed in L, corresponding to the
partition

$$\Pi = \{a = t_0 \le t_1 \le \dots \le t_n = b\},$$

with parameter

$$d(\Pi) = \max\{\Delta t_0, \Delta t_1, \dots, \Delta t_{n-1}\} \qquad (\Delta t_k = t_{k+1} - t_k).$$

Let $\Omega_\Pi$ be the surface of revolution generated by rotating $L_\Pi$ about the x-axis.
Since $\Omega_\Pi$ is made up of conical bands, we define its area $S(\Omega_\Pi)$ in the obvious
way as the sum of the areas of these bands. Defining the area of $\Omega$ itself as the
limit

$$S = \lim_{d(\Pi) \to 0} S(\Omega_\Pi), \qquad (8)$$

we now derive a formula for S (this incidentally proves the existence and
uniqueness of S).


THEOREM. The limit (8) equals

$$S = 2\pi \int_a^b y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt \qquad (9)$$

if x(t) and y(t) have continuous derivatives on [a, b].

Proof. The vertices of $L_\Pi$ are at the points $P_k = (x_k, y_k)$, where

$$x_k = x(t_k), \quad y_k = y(t_k) \qquad (k = 0, 1, \dots, n),$$

and the length of the segment $P_kP_{k+1}$ is just

$$\Delta l_k = \sqrt{(\Delta x_k)^2 + (\Delta y_k)^2}.$$

Rotating $P_kP_{k+1}$ about the x-axis generates a conical band of area

$$\pi(y_k + y_{k+1})\,\Delta l_k,$$

by (7), and hence the area of $\Omega_\Pi$ is given by

$$S_\Pi = S(\Omega_\Pi) = \pi\sum_{k=0}^{n-1} (y_k + y_{k+1})\,\Delta l_k.$$

Using Lagrange’s theorem (Sec. 7.44), we can write this expression in the form

$$S_\Pi = \pi\sum_{k=0}^{n-1} [y(t_k) + y(t_{k+1})]\sqrt{[x'(\xi_k)]^2 + [y'(\eta_k)]^2}\;\Delta t_k,$$

where the points $\xi_k$ and $\eta_k$ both lie in the interval $(t_k, t_{k+1})$ but are in general
distinct.

Besides $S_\Pi$, we also consider the expression

$$S_\Pi^* = 2\pi\sum_{k=0}^{n-1} y(t_k)\sqrt{[x'(t_k)]^2 + [y'(t_k)]^2}\;\Delta t_k,$$

which is clearly a Riemann sum of the continuous function

$$2\pi y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}$$

(it is assumed that x'(t) and y'(t) are continuous), and hence approaches the right-hand
side of (9) as $d(\Pi) \to 0$. Thus the proof of (9) will be complete if we
manage to show that

$$\lim_{d(\Pi) \to 0} (S_\Pi - S_\Pi^*) = 0. \qquad (10)$$
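This is done as follows. Given any $\varepsilon > 0$, there is a $\delta > 0$ such that $|t - t^*| < \delta$
implies

$$|x'(t) - x'(t^*)| \le \varepsilon, \quad |y(t) - y(t^*)| \le \varepsilon, \quad |y'(t) - y'(t^*)| \le \varepsilon$$

(why?). Hence, in particular, $d(\Pi) < \delta$ implies

$$y(t_{k+1}) = y(t_k) + \theta_1\varepsilon \qquad (|\theta_1| \le 1),$$

$$x'(\xi_k) = x'(t_k) + \theta_2\varepsilon \qquad (|\theta_2| \le 1),$$

$$y'(\eta_k) = y'(t_k) + \theta_3\varepsilon \qquad (|\theta_3| \le 1),$$

so that

$$[y(t_k) + y(t_{k+1})]\sqrt{[x'(\xi_k)]^2 + [y'(\eta_k)]^2}$$

$$= [2y(t_k) + \theta_1\varepsilon]\sqrt{[x'(t_k) + \theta_2\varepsilon]^2 + [y'(t_k) + \theta_3\varepsilon]^2}$$

$$= [2y(t_k) + \theta_1\varepsilon]\sqrt{[x'(t_k)]^2 + [y'(t_k)]^2 + \theta_4\varepsilon},$$

where

$$|\theta_4| \le 2\max_{a \le t \le b}\{|x'(t)| + |y'(t)| + \varepsilon\}.$$

Moreover,

$$\sqrt{a} - \sqrt{b} \le \sqrt{a + b} \le \sqrt{a} + \sqrt{b}$$

for arbitrary $a, b \ge 0$, and hence

$$\sqrt{a + b} = \sqrt{a} + \theta_5\sqrt{b} \qquad (|\theta_5| \le 1).$$

It follows that

$$\sqrt{[x'(t_k)]^2 + [y'(t_k)]^2 + \theta_4\varepsilon} = \sqrt{[x'(t_k)]^2 + [y'(t_k)]^2} + \theta_6\sqrt{|\theta_4|\varepsilon},$$

where $|\theta_6| \le 1$. Thus, finally,

$$S_\Pi - S_\Pi^* = \pi\sum_{k=0}^{n-1} [2y(t_k) + \theta_1\varepsilon]\left[\sqrt{[x'(t_k)]^2 + [y'(t_k)]^2} + \theta_6\sqrt{|\theta_4|\varepsilon}\right]\Delta t_k - 2\pi\sum_{k=0}^{n-1} y(t_k)\sqrt{[x'(t_k)]^2 + [y'(t_k)]^2}\;\Delta t_k$$

$$= \pi\sum_{k=0}^{n-1} \left[\theta_1\varepsilon\sqrt{[x'(t_k)]^2 + [y'(t_k)]^2} + 2y(t_k)\,\theta_6\sqrt{|\theta_4|\varepsilon} + \theta_1\theta_6\,\varepsilon\sqrt{|\theta_4|\varepsilon}\right]\Delta t_k,$$

where the right-hand side clearly approaches zero as $d(\Pi) \to 0$, thereby
establishing (10).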


In the case t = x, formula (9) reduces to

$$S = 2\pi\int_a^b y(x)\sqrt{1 + [y'(x)]^2}\,dx. \qquad (9')$$


9.84. Examples 
a. Rotating the semicircle

$$x = r\cos\theta, \quad y = r\sin\theta \qquad (0 \le \theta \le \pi)$$

about the x-axis generates a sphere of radius r (see Figure 51). It follows from
(9) that the area of the sphere is

$$S = 2\pi\int_0^\pi r\sin\theta\sqrt{r^2\sin^2\theta + r^2\cos^2\theta}\,d\theta = 2\pi r^2\int_0^\pi \sin\theta\,d\theta = -2\pi r^2\cos\theta\,\Big|_0^\pi = 4\pi r^2.$$


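b. Rotating the catenary

$$y = \cosh x \qquad (-a \le x \le a)$$

about the x-axis, we get the surface of revolution shown in Figure 52, called a
catenoid. Using (9'), we find that the area of the surface is

$$S = 2\pi\int_{-a}^{a} \cosh x\sqrt{1 + \sinh^2 x}\,dx = 2\pi\int_{-a}^{a} \cosh^2 x\,dx$$

$$= 2\pi\int_{-a}^{a} \frac{\cosh 2x + 1}{2}\,dx = \pi\left(\frac{1}{2}\sinh 2x + x\right)\Big|_{-a}^{a} = \pi(\sinh 2a + 2a).$$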
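
Both examples can be cross-checked by evaluating formula (9) with a simple midpoint Riemann sum. The sketch below is illustrative only; the helper `revolution_area` and the sample values r = 1.5 and a = 1 are assumptions made here.

```python
import math


def revolution_area(y, dx, dy, a, b, steps=20_000):
    """Midpoint sum for (9): 2*pi * integral of y(t) sqrt(x'(t)^2 + y'(t)^2) dt."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += y(t) * math.hypot(dx(t), dy(t))
    return 2.0 * math.pi * h * total


# Sphere: semicircle x = r cos(theta), y = r sin(theta), 0 <= theta <= pi.
r = 1.5
sphere_area = revolution_area(
    y=lambda t: r * math.sin(t),
    dx=lambda t: -r * math.sin(t),
    dy=lambda t: r * math.cos(t),
    a=0.0, b=math.pi,
)

# Catenoid: y = cosh x with x itself as the parameter, -a <= x <= a.
half_width = 1.0
catenoid_area = revolution_area(
    y=math.cosh,
    dx=lambda t: 1.0,
    dy=math.sinh,
    a=-half_width, b=half_width,
)

print(sphere_area, 4.0 * math.pi * r * r)
print(catenoid_area, math.pi * (math.sinh(2.0 * half_width) + 2.0 * half_width))
```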


9.9. Further Applications of Integration 


9.91. We begin with an almost trivial observation. Let v = v(x) be a function
defined on an interval [a, b], and suppose we know the principal linear part (or
differential)

$$dv = f(x)\,dx \qquad (1)$$

of the increment

$$\Delta v = v(x + dx) - v(x) \qquad (dx = \Delta x)$$

for all $x, x + dx \in [a, b]$. Then one value of v, say $v(\beta)$, is related to another value
of v, say $v(\alpha)$, by the formula

$$v(\beta) = v(\alpha) + \int_\alpha^\beta f(x)\,dx \qquad (2)$$

($a \le \alpha$, $\beta \le b$). To see this, we need merely recall from Sec. 7.31 that f(x) is just
the derivative v'(x).

In some cases where it is hard to find (1), we can establish an inequality of the
form

$$dv \le f(x)\,dx \qquad (1')$$

instead. Integrating (1'), we then get the estimate

$$v(\beta) \le v(\alpha) + \int_\alpha^\beta f(x)\,dx. \qquad (2')$$

In finding an expression for the differential dv (as opposed to the increment $\Delta v$),
things are often simplified by the fact that quantities “of a higher order of
smallness than dx” can be neglected (why?).
9.92. Examples 


a. How much work is needed to lift a mass m completely out of the earth’s 
gravitational field? 


Solution. Let W = W(h) denote the work needed to lift the mass to a height h
above the earth’s surface. According to Newton’s law of gravitation, the earth
attracts a mass m at a height h above its surface with a force

$$F = F(h) = C\,\frac{m}{(R + h)^2},$$

where R is the radius of the earth (approximately 4000 miles). To determine C,
we note that the gravitational force acting on a mass m at the earth’s surface (h =
0) is just mg, where g is the “acceleration due to gravity” (approximately 32
ft/sec²), and hence $C = R^2g$. Suppose we lift the mass from a height h above the
earth’s surface to a height h + dh. This requires an amount of work equal to

$$dW = F\,dh \qquad (3)$$

provided F does not vary over the interval [h, h + dh]. Actually, of course, F
does vary over [h, h + dh], but only by an amount of a higher order of smallness
than F dh, which can therefore be neglected in writing the differential (3). To
find the amount of work needed to lift the mass m from the earth’s surface to a
height H, we now apply (2), obtaining

$$W(H) = \int_0^H \frac{R^2mg}{(R + h)^2}\,dh = -\frac{R^2mg}{R + h}\,\Big|_0^H = Rmg - \frac{R^2mg}{R + H}.$$

The second term on the right is negligible compared to the first for very large H,
corresponding to the fact that increasing H further requires only a negligible
amount of extra work against the force of gravitational attraction. Hence, finally,
the amount of work needed to lift m “completely out of the earth’s gravitational
field” is just Rmg.
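
The behavior of W(H) for large H can be illustrated with the rounded values quoted in the text (R about 4000 miles, g about 32 ft/sec²). The sketch below is an assumption-laden illustration: it converts miles to feet, takes a unit mass, and uses hypothetical helper names; the numbers are only as accurate as those rounded inputs.

```python
R_FEET = 4000 * 5280.0   # earth radius from the text, converted to feet
G_FT = 32.0              # acceleration due to gravity, ft/sec^2 (from the text)
MASS = 1.0               # unit mass, an assumption for the demonstration


def work_closed_form(H, R=R_FEET, m=MASS, g=G_FT):
    """W(H) = R m g - R^2 m g / (R + H)."""
    return R * m * g - R * R * m * g / (R + H)


def work_numeric(H, R=R_FEET, m=MASS, g=G_FT, steps=100_000):
    """Midpoint Riemann sum of F(h) = R^2 m g / (R + h)^2 over [0, H]."""
    step = H / steps
    return step * sum(R * R * m * g / (R + (i + 0.5) * step) ** 2
                      for i in range(steps))


escape_work = R_FEET * MASS * G_FT   # the limiting value R m g

# Numerical integration agrees with the closed form for a moderate height.
numeric_at_R = work_numeric(R_FEET)
closed_at_R = work_closed_form(R_FEET)

# For very large H the closed form is already close to R m g.
closed_far = work_closed_form(1.0e6 * R_FEET)

print(numeric_at_R, closed_at_R)
print(closed_far / escape_work)
```

Lifting the mass to a height of one earth radius already costs half of the limiting work Rmg, since W(R) = Rmg/2.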


b. A funnel of height H is filled with water. Let the top of the funnel be of radius
R and area S, while the bottom (i.e., the hole) is of radius r and area s. Suppose it
is known that water leaves the funnel with velocity

$$v(h) = \sqrt{2gh}$$

at the moment when the height of the free surface of the water above the hole is
h. Find the time T it takes for all the water to flow out of the funnel.


Solution. Let the height of the water at time t be h = h(t), and let S(h) be the area
of the cross section at height h from the bottom (see Figure 53). During the
interval [t, t + dt] an amount of water of volume

$$vs\,dt = \sqrt{2gh}\,s\,dt$$

leaves the funnel. (Here we assume that v is constant during the interval [t, t +
dt], since taking account of the actual variation of v leads to a “correction” of a
higher order of smallness than dt.) This is an amount of water equal to the
volume

$$S(h)\,dh$$

of a circular cylinder of height dh and cross-sectional area S(h). (Again the
actual volume of water is that of a conical frustum, but this differs from the
volume of the indicated cylinder by a quantity of a higher order of smallness
than dh.) It follows that

$$S(h)\,dh = \sqrt{2gh}\,s\,dt$$

or

$$s\,dt = \frac{S(h)}{\sqrt{2gh}}\,dh. \qquad (4)$$

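Substituting

$$S(h) = \pi\left[r + (R - r)\frac{h}{H}\right]^2$$

into (4) and integrating from 0 to T, we get

$$\int_0^T s\,dt = sT = \frac{\pi}{\sqrt{2g}}\int_0^H \left[\frac{r^2}{h^{1/2}} + \frac{2r(R - r)h^{1/2}}{H} + \frac{(R - r)^2h^{3/2}}{H^2}\right]dh$$

$$= \frac{\pi}{\sqrt{2g}}\left[2r^2h^{1/2} + \frac{4r(R - r)h^{3/2}}{3H} + \frac{2(R - r)^2h^{5/2}}{5H^2}\right]_0^H$$

(cf. Theorem 9.53), so that finally, since $s = \pi r^2$,

$$T = \sqrt{\frac{H}{2g}}\left[2 + \frac{4}{3}\left(\frac{R}{r} - 1\right) + \frac{2}{5}\left(\frac{R}{r} - 1\right)^2\right].$$

If R is very much larger than r, it is a good approximation to write

$$T \approx \frac{2}{5}\left(\frac{R}{r}\right)^2\sqrt{\frac{H}{2g}}.$$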


9.93. The neighborhood of a plane curve. The set $V_\rho(L)$ of points belonging to
all closed disks of radius $\rho$ with centers on a given plane curve L is called a
$\rho$-neighborhood of L (see Figure 54). Let L be piecewise smooth, and suppose L
has no singular points in the sense of Sec. 9.74. Then it can be shown that $V_\rho(L)$
has area (cf. p. 486).

THEOREM. Let $S_\rho(L)$ be the area of the set $V_\rho(L)$. Then

$$S_\rho(L) \le 2\pi\rho l + \pi\rho^2, \qquad (5)$$

where l is the length of the curve L.


Proof. As in Sec. 9.74 we choose arc length along L as the parameter (s varies
from 0 at the initial point A to l at the final point B). Let $S_\rho(L_\sigma)$ be the area of a
$\rho$-neighborhood of the arc $L_\sigma \subset L$ corresponding to parameter values in the interval
$0 \le s \le \sigma$, and let $K_\rho$ be the disk of radius $\rho$ centered at the point C with
parameter value $s = \sigma$. The possible positions of points $C' \in L$ with parameter
values in the interval $\sigma \le s \le \sigma + ds$ all lie in a disk of radius ds centered at C,
while the possible positions of points of the disks of radius $\rho$ centered at the
points C' all lie in a disk $K_{\rho + ds}$ of radius $\rho + ds$ centered at the point C (see
Figure 55). Hence the increment of $S_\rho(L_\sigma)$ in going from the parameter value $\sigma$ to
the value $\sigma + ds$ cannot exceed the difference between the areas of the disks
$K_{\rho + ds}$ and $K_\rho$, i.e., the quantity

$$\pi(\rho + ds)^2 - \pi\rho^2 = 2\pi\rho\,ds + \pi(ds)^2.$$

Thus the principal linear part of the increment satisfies the inequality

$$dS_\rho(L_\sigma) \le 2\pi\rho\,ds. \qquad (6)$$

Integrating (6) with respect to s from 0 to l, we get

$$S_\rho(L) - S_\rho(L_0) \le 2\pi\rho l,$$

which implies (5), since $S_\rho(L_0) = \pi\rho^2$.


9.94. Volume of a solid with horizontal cross sections of known area. It
follows from the general theory of “measure” that the volume of
a right cylinder of altitude H and base of area S is just SH (see Figure 56). Let V
be a solid whose horizontal cross section at height h above its base, denoted by
$V_h$, is of area S(h), as shown in Figure 57. Suppose the boundary of $V_h$ varies
continuously with the height h, in the sense that given any $\varepsilon > 0$, there is a $\delta > 0$
such that the projection of the boundary of the cross section $V_{h'}$ onto the plane of
the base lies in an $\varepsilon$-neighborhood of the projection of the boundary of the cross
section $V_h$ onto the same plane whenever $|h' - h| < \delta$. But then the absolute value
of the difference between the areas of the two cross sections $V_h$ and $V_{h'}$ does not
exceed the area of the indicated $\varepsilon$-neighborhood, and hence, by Theorem 9.93, is
bounded from above by $2\pi\varepsilon L$, where L is an upper bound for the lengths of the
boundaries of all the $V_h$.


Now let v(h) denote the volume of the part of V between the base and the
horizontal plane at height h over the base. Then the principal linear part of the
increment of v(h) when the argument changes from h to h + dh can be written in
the form S(h) dh, precisely as if V were cylindrical in the interval [h, h + dh]. In
fact, as just shown, the actual increment in volume differs from S(h) dh by a
quantity of a higher order of smallness than dh. It follows that

$$dv = S(h)\,dh,$$

and hence, integrating from 0 to H, we find that the volume of V is just

$$v = v(H) = \int_0^H S(h)\,dh. \qquad (7)$$


9.95. Examples 


a. Let V be a solid cone of altitude H and upper cross-sectional area S (see
Figure 58). By similarity, the cross section at height h has area

$$S(h) = S\left(\frac{h}{H}\right)^2.$$

It follows from (7) that

$$v = \int_0^H S\,\frac{h^2}{H^2}\,dh = \frac{S}{H^2}\,\frac{h^3}{3}\,\Big|_0^H = \frac{1}{3}SH.$$

b. Let V be a solid sphere of radius R. Then, as shown in Figure 59, the area of
the cross section at height h (above the center) equals $\pi(R^2 - h^2)$, and hence, by
(7),

$$v = 2\int_0^R \pi(R^2 - h^2)\,dh = 2\pi\left(R^3 - \frac{R^3}{3}\right) = \frac{4}{3}\pi R^3.$$

c. For a solid of revolution, (7) takes the form

$$v = \pi\int_0^H r^2(h)\,dh, \qquad (7')$$

where r(h) is the radius of the cross section at height h.


d. Let V be the solid cut out of a sphere of radius R by the paraboloid of
revolution

$$2az = x^2 + y^2, \qquad (8)$$

with vertex at the center of the sphere (see Figure 60). According to (8), the area
of the cross section of the paraboloid at height h (above the center of the sphere)
is just $2\pi ah$, while, on the other hand, the area of the cross section of the sphere
at height h is $\pi(R^2 - h^2)$, as in the preceding example. The sphere and the
paraboloid intersect in the plane $z = z_0$, where

$$2az_0 = R^2 - z_0^2.$$

Hence

$$z_0 = -a + \sqrt{a^2 + R^2},$$

so that the volume of V is

$$v = \int_0^{z_0} 2\pi ah\,dh + \int_{z_0}^{R} \pi(R^2 - h^2)\,dh$$

(as an exercise, evaluate these simple integrals).


e. If the solid V is expanded k times along some axis, then its volume is enlarged
k times. This is proved in much the same way as in the plane case (cf. Example
9.62d). In particular, we see that the volume of an ellipsoid with semiaxes a, b,
and c equals $\frac{4}{3}\pi abc$, since a sphere of radius 1 has volume $\frac{4}{3}\pi$ and the ellipsoid is
obtained by expanding the sphere a times along the x-axis, b times along the
y-axis, and c times along the z-axis.
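Formula (7) lends itself to a quick numerical check. The following is only a sketch, assuming the composite Simpson rule as the quadrature method; the helper names `simpson` and `volume` and the sample dimensions are illustrative and do not come from the text.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def volume(cross_section_area, height, n=1000):
    """Formula (7): integrate the cross-sectional area S(h) from 0 to H."""
    return simpson(cross_section_area, 0.0, height, n)

# Example a: cone of altitude H with upper cross-sectional area S.
S, H = 3.0, 2.0
cone = volume(lambda h: S * (h / H) ** 2, H)

# Example b: sphere of radius R, twice the volume of the upper half.
R = 1.5
sphere = 2 * volume(lambda h: math.pi * (R * R - h * h), R)

# Example d: solid cut out of the sphere by the paraboloid 2az = x^2 + y^2.
a = 1.0
z0 = -a + math.sqrt(a * a + R * R)
cut = (simpson(lambda h: 2 * math.pi * a * h, 0.0, z0)
       + simpson(lambda h: math.pi * (R * R - h * h), z0, R))

print(cone, sphere, cut)
```

Since all three integrands are polynomials of degree at most two, Simpson's rule reproduces the closed-form values up to rounding error.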


9.10. Integration of Sequences of Functions 


9.101. Let

$$f_1(x), f_2(x), \ldots, f_n(x), \ldots \quad (a \le x \le b)$$


be a sequence of integrable functions converging everywhere on [a, b] to a limit 
function f(x). Then two questions arise naturally: 
(a) Is the function f(x) also integrable? 


(b) If so, does

$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \int_a^b f_n(x)\,dx\,? \tag{1}$$

In general, the answers to both questions are negative, as shown by the following
examples:

(a) Let $f_n(x)$ equal 1 if $x \in [0, 1]$ is of the form p/q with $q \le n$, and let $f_n(x) = 0$
otherwise. Then every $f_n(x)$ is nonzero at only a finite number of points, and
hence $f_n(x)$ is integrable with integral 0 (see Sec. 9.16d). The limit f(x) of the
sequence $f_n(x)$ equals 1 at every rational point of [0, 1] and 0 at every irrational
point of [0, 1], i.e., the function f(x) is the Dirichlet function, shown to be
nonintegrable in Sec. 9.17.

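(b) Let

$$f_n(x) = \begin{cases} n \sin nx & \text{if } 0 \le x \le \pi/n, \\ 0 & \text{if } \pi/n \le x \le \pi. \end{cases}$$

Then $f_n(x)$ approaches 0 at every point $x \in [0, \pi]$, and hence the limit function
f(x) is obviously integrable with integral 0. But

$$\int_0^\pi f_n(x)\,dx = \int_0^{\pi/n} n \sin nx\,dx = 2 \quad (n = 1, 2, \ldots),$$

so that (1) fails to hold.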
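Example (b) can be observed numerically. The sketch below assumes a midpoint-rule quadrature; the names `f` and `integral_fn` are illustrative. It shows the integrals staying near 2 while the values at a fixed point x > 0 are eventually zero.

```python
import math

def f(n, x):
    """f_n from example (b): n*sin(nx) on [0, pi/n], zero on the rest of [0, pi]."""
    return n * math.sin(n * x) if 0.0 <= x <= math.pi / n else 0.0

def integral_fn(n, m=20000):
    """Midpoint rule over [0, pi/n]; f_n vanishes on the rest of [0, pi]."""
    h = (math.pi / n) / m
    return sum(f(n, (k + 0.5) * h) for k in range(m)) * h

# The integrals do not tend to 0 ...
integrals = [integral_fn(n) for n in (1, 5, 50)]

# ... although at a fixed point x > 0 the values f_n(x) vanish once pi/n < x.
x = 0.3
tail = [f(n, x) for n in (20, 50, 100)]

print(integrals, tail)
```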


9.102. Thus an affirmative answer to both questions posed above can only be
expected if extra conditions are imposed on the nature of the convergence of the
sequence $f_n(x)$. As we will see in a moment, such an extra condition is afforded
by the requirement that the convergence of $f_n(x)$ to its limit f(x) be uniform. It
will be recalled from Sec. 5.93b that a sequence of functions $f_n(x)$ is said to
converge uniformly on [a, b] to the limit function f(x) if, given any $\varepsilon > 0$, there
exists an integer N > 0 such that

$$|f(x) - f_n(x)| \le \varepsilon$$

for all n > N and all $x \in [a, b]$.


THEOREM. Let $f_n(x)$ be a sequence of functions integrable on [a, b] which
converges uniformly on [a, b] to a limit function f(x). Then f(x) is also integrable
on [a, b]. Moreover, the formula

$$\lim_{n\to\infty} \int_a^\xi f_n(x)\,dx = \int_a^\xi f(x)\,dx \tag{2}$$

holds uniformly for all $\xi \in [a, b]$, and in particular

$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$

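Proof. For arbitrary partitions $\Pi$ and $\Pi'$ of the interval [a, b] and every n = 1, 2,
..., we have

$$S_\Pi(f) = S_\Pi(f_n) + S_\Pi(f - f_n), \tag{3}$$
$$S_{\Pi'}(f) = S_{\Pi'}(f_n) + S_{\Pi'}(f - f_n), \tag{4}$$

where $S_\Pi(f)$, $S_\Pi(f_n)$, ... are Riemann sums of the functions as in Sec. 9.12. Given
any $\varepsilon > 0$, let n be such that

$$|f(x) - f_n(x)| < \frac{\varepsilon}{3(b-a)}$$

for all $x \in [a, b]$. Then, obviously,

$$|S_\Pi(f - f_n)| < \frac{\varepsilon}{3}, \qquad |S_{\Pi'}(f - f_n)| < \frac{\varepsilon}{3}.$$

Since $f_n(x)$ is integrable on [a, b], there is a $\delta > 0$ such that

$$|S_\Pi(f_n) - S_{\Pi'}(f_n)| < \frac{\varepsilon}{3}$$

for any partitions $\Pi$ and $\Pi'$ with $d(\Pi) < \delta$, $d(\Pi') < \delta$. But then it follows from (3)
and (4) with the same partitions that

$$|S_\Pi(f) - S_{\Pi'}(f)| \le |S_\Pi(f_n) - S_{\Pi'}(f_n)| + |S_\Pi(f - f_n)| + |S_{\Pi'}(f - f_n)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$

Therefore the Cauchy convergence criterion for the existence of a limit in the
direction $d(\Pi) \to 0$ (Theorem 4.19) is satisfied for the Riemann sums of f(x), and
hence f(x) is integrable on [a, b].

To prove the second assertion, we note first that f(x), like $f_n(x)$, is integrable on
every interval $[a, \xi] \subset [a, b]$, by Theorem 9.15i. Since $f_n(x)$ converges uniformly
to f(x) on [a, b], given any $\varepsilon > 0$, there is an N such that

$$|f(x) - f_n(x)| \le \varepsilon$$

for all n > N and all $x \in [a, b]$. But then

$$\left| \int_a^\xi f(x)\,dx - \int_a^\xi f_n(x)\,dx \right| = \left| \int_a^\xi [f(x) - f_n(x)]\,dx \right| \le \int_a^\xi |f(x) - f_n(x)|\,dx \le \varepsilon(\xi - a) \le \varepsilon(b - a)$$

for all n > N and all $\xi \in [a, b]$, which implies (2). ∎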

9.103. The above theorem has a natural analogue for a series of functions:

THEOREM. Let

$$\varphi_1(x) + \varphi_2(x) + \cdots + \varphi_n(x) + \cdots$$

be a series of functions, each integrable on [a, b], which converges uniformly on
[a, b] to a sum function $\varphi(x)$. Then $\varphi(x)$ is also integrable on [a, b]. Moreover,
the formula

$$\lim_{n\to\infty} \sum_{k=1}^n \int_a^\xi \varphi_k(x)\,dx = \int_a^\xi \varphi(x)\,dx$$

holds uniformly for all $\xi \in [a, b]$, and in particular

$$\sum_{k=1}^\infty \int_a^b \varphi_k(x)\,dx = \int_a^b \varphi(x)\,dx.$$

Proof. Choose

$$f_n(x) = \sum_{k=1}^n \varphi_k(x)$$

in Theorem 9.102. ∎


In other words, uniformly convergent series of functions can be “integrated 
term by term.” 


9.104. Examples 


a. The series

$$\frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots,$$

$$\frac{1}{1+x^2} = 1 - x^2 + x^4 - x^6 + \cdots$$

both converge uniformly on every interval [−b, b], 0 < b < 1 (see Theorem
6.65a). Integrating these series term by term from 0 to $\xi$, we get the new series

$$\ln(1+\xi) = \xi - \frac{\xi^2}{2} + \frac{\xi^3}{3} - \cdots, \tag{5}$$

$$\operatorname{arc\,tan} \xi = \xi - \frac{\xi^3}{3} + \frac{\xi^5}{5} - \frac{\xi^7}{7} + \cdots, \tag{6}$$

which also converge uniformly on [−b, b]. Moreover, since the series on the
right in (5) and (6) converge for $\xi = 1$ (Theorem 6.23), while the functions on the
left are continuous for $0 \le \xi \le 1$, we find that

$$\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots,$$

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \tag{7}$$

(Theorem 6.68).


b. Theorem 9.103 can be used to find series expansions of nonelementary
functions. For example,

$$\frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots \tag{8}$$

converges uniformly on every interval [−T, T]. Hence, integrating (8) term by
term from 0 to $\xi$, we get the following expansion of the sine integral (Sec.
9.46b):

$$\operatorname{Si} \xi = \int_0^\xi \frac{\sin x}{x}\,dx = \xi - \frac{\xi^3}{3 \cdot 3!} + \frac{\xi^5}{5 \cdot 5!} - \frac{\xi^7}{7 \cdot 7!} + \cdots. \tag{9}$$

The series (9) is also uniformly convergent on every interval [−T, T].
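Term-by-term integration can be sanity-checked numerically. The sketch below assumes a midpoint-rule quadrature as the independent reference; the function names are illustrative. It compares a partial sum of series (9) with a direct quadrature of sin x / x, and also shows how slowly series (5) converges at the endpoint $\xi = 1$.

```python
import math

def si_series(xi, terms=20):
    """Partial sum of series (9): sum of (-1)^k xi^(2k+1) / ((2k+1) * (2k+1)!)."""
    return sum((-1) ** k * xi ** (2 * k + 1)
               / ((2 * k + 1) * math.factorial(2 * k + 1))
               for k in range(terms))

def si_quadrature(xi, m=20000):
    """Midpoint rule for the integral of sin(x)/x over [0, xi]; midpoints avoid x = 0."""
    h = xi / m
    return sum(math.sin((k + 0.5) * h) / ((k + 0.5) * h) for k in range(m)) * h

def log1p_series(xi, terms):
    """Partial sum of series (5) for ln(1 + xi)."""
    return sum((-1) ** (k + 1) * xi ** k / k for k in range(1, terms + 1))

print(si_series(2.0), si_quadrature(2.0))
print(log1p_series(0.5, 60), math.log(1.5))
print(log1p_series(1.0, 1000), math.log(2.0))  # slow convergence at xi = 1
```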


9.105. Next we consider differentiation of sequences of functions, asking
questions analogous to those posed in Sec. 9.101 for the case of integration. Let

$$f_1(x), f_2(x), \ldots, f_n(x), \ldots \quad (a \le x \le b)$$

be a sequence of differentiable functions converging everywhere on [a, b] to a
limit function f(x). Then we ask:

(a) Is the function f(x) also differentiable?

(b) If so, does

$$f'(x) = \lim_{n\to\infty} f_n'(x)\,? \tag{10}$$

As shown by the following examples, the answers to both questions are again
negative in general, even for a uniformly convergent sequence $f_n(x)$:

(a) The sequence of differentiable functions

$$f_n(x) = |x|^{1 + (1/n)}$$

converges uniformly on [−1, 1] to the function f(x) = |x|, which has no derivative
at x = 0.

(b) The sequence of functions

$$f_n(x) = \frac{1}{n} \sin nx \quad (0 \le x \le \pi)$$

converges uniformly to zero, and hence the limit function is obviously
differentiable. However, (10) fails to hold, since the sequence $f_n'(x) = \cos nx$
converges nowhere except at the point x = 0.
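A small numerical illustration of example (b) follows. It is a sketch only: the grid of 1001 sample points and the names `f` and `df` are illustrative choices. The sampled sup norm of $f_n$ shrinks like 1/n, while the derivatives at $x = \pi$ keep alternating between −1 and 1.

```python
import math

def f(n, x):
    """f_n(x) = (1/n) sin(nx)."""
    return math.sin(n * x) / n

def df(n, x):
    """f_n'(x) = cos(nx)."""
    return math.cos(n * x)

grid = [k * math.pi / 1000 for k in range(1001)]

# Sampled sup norm of f_n on [0, pi]: bounded by 1/n, so f_n -> 0 uniformly.
sup_norms = [max(abs(f(n, x)) for x in grid) for n in (1, 10, 100)]

# Derivatives at x = pi: cos(n*pi) alternates in sign and has no limit.
derivs_at_pi = [round(df(n, math.pi)) for n in (1, 2, 3, 4)]

print(sup_norms, derivs_at_pi)
```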


9.106. The situation changes if we assume that the sequence of derivatives $f_n'(x)$
converges uniformly. It then turns out that much less need be assumed about the
sequence of functions $f_n(x)$ themselves:

THEOREM. Let $f_n(x)$ be a sequence of piecewise smooth functions on [a, b], which
converges for at least one point $x_0 \in [a, b]$, and suppose the sequence of
derivatives $f_n'(x)$ converges uniformly on [a, b] to a piecewise continuous limit
function g(x). Then the sequence $f_n(x)$ converges uniformly on [a, b] to a
piecewise smooth function f(x) with derivative

$$f'(x) = \lim_{n\to\infty} f_n'(x) = g(x)$$

at every continuity point of g(x).

Proof. We have

$$f_n(\xi) - f_n(x_0) = \int_{x_0}^\xi f_n'(x)\,dx \quad (n = 1, 2, \ldots), \tag{11}$$

by Theorem 9.33, where the sequence (11) is uniformly convergent for all $\xi \in [a, b]$,
by Theorem 9.102. But then $f_n(x)$ converges uniformly on [a, b], since the
numerical sequence $f_n(x_0)$ is convergent. Let f(x) denote the limit of $f_n(x)$. Taking
the limit as $n \to \infty$ in (11) and using Theorem 9.102 again, we get

$$f(\xi) - f(x_0) = \int_{x_0}^\xi g(x)\,dx.$$

But g(x) is piecewise continuous, by hypothesis, and hence, by Theorem 9.31,
f(x) is differentiable everywhere except at the discontinuity points of g(x), with
derivative

$$f'(x) = g(x) = \lim_{n\to\infty} f_n'(x). \;∎$$

9.107. As in the case of integration, the above theorem has a natural analogue for
a series of functions:

THEOREM. Let

$$\varphi_1(x) + \varphi_2(x) + \cdots + \varphi_n(x) + \cdots \tag{12}$$

be a series of functions, each piecewise smooth on [a, b], which converges for at
least one point $x_0 \in [a, b]$, and suppose the series of derivatives

$$\varphi_1'(x) + \varphi_2'(x) + \cdots + \varphi_n'(x) + \cdots$$

converges uniformly on [a, b] to a piecewise continuous sum function g(x). Then
the series (12) converges uniformly on [a, b] to a piecewise smooth function $\varphi(x)$
with derivative

$$\varphi'(x) = g(x)$$

at every continuity point of g(x).

Proof. Choose

$$f_n(x) = \sum_{k=1}^n \varphi_k(x)$$

in Theorem 9.106. ∎


9.11. Parameter-Dependent Integrals

9.111. THEOREM. Let f(x, t) be a real function of two variables $x \in A = [a, b]$, $t \in M$,
where M is a metric space with distance $\rho$, and suppose f(x, t) is uniformly
continuous in both x and t, i.e., on the product metric space $Q = A \times M$.† Then
the function

$$\Phi(t) = \int_a^b f(x, t)\,dx \quad (t \in M)$$

is uniformly continuous on M.

Proof. Let

$$\omega_f(\delta) = \sup_{|x' - x''| \le \delta,\ \rho(t', t'') \le \delta} |f(x', t') - f(x'', t'')|$$

be the modulus of continuity of the function f(x, t) on the space Q (see Sec.
5.17c). Clearly, if $\rho(t', t'') \le \delta$, then

$$|\Phi(t') - \Phi(t'')| = \left| \int_a^b [f(x, t') - f(x, t'')]\,dx \right| \le \int_a^b |f(x, t') - f(x, t'')|\,dx \le \omega_f(\delta)(b - a),$$

and hence

$$\omega_\Phi(\delta) = \sup_{\rho(t', t'') \le \delta} |\Phi(t') - \Phi(t'')| \le \omega_f(\delta)(b - a). \tag{1}$$

But given any $\varepsilon > 0$, there is a $\delta_0 > 0$ such that $\delta \le \delta_0$ implies $\omega_f(\delta) < \varepsilon$, by the
uniform continuity of f(x, t) on Q. It follows from (1) that $\delta \le \delta_0$ also implies
$\omega_\Phi(\delta) < \varepsilon(b - a)$, so that $\Phi(t)$ is uniformly continuous on M. ∎


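9.112. THEOREM. If f(x, t) is a continuous real function on the rectangle $a \le x \le b$,
$\alpha \le t \le \beta$, then

$$\int_a^b \left\{ \int_\alpha^\beta f(x, t)\,dt \right\} dx = \int_\alpha^\beta \left\{ \int_a^b f(x, t)\,dx \right\} dt. \tag{2}$$

Proof. The functions

$$F(x) = \int_\alpha^\beta f(x, t)\,dt \quad (a \le x \le b),$$

$$\Phi(t) = \int_a^b f(x, t)\,dx \quad (\alpha \le t \le \beta)$$

are both (uniformly) continuous, by the preceding theorem. Let G(t) be any
function continuous on $[\alpha, \beta]$, and let

$$\sum_{k=0}^{n-1} G(\tau_k)\,\Delta t_k \quad (\Delta t_k = t_{k+1} - t_k,\ t_k \le \tau_k \le t_{k+1})$$

be a Riemann sum of G(t) corresponding to a partition $\Pi$ of $[\alpha, \beta]$ with $d(\Pi) < \delta$.
Then, by Theorem 9.15k,

$$\left| \int_\alpha^\beta G(t)\,dt - \sum_{k=0}^{n-1} G(\tau_k)\,\Delta t_k \right| \le \omega_G(\delta)(\beta - \alpha), \tag{3}$$

where $\omega_G(\delta)$ is the modulus of continuity of G(t) on $[\alpha, \beta]$. Applying this
estimate to the function f(x, t) with x fixed and $\tau_k = t_k$, we get

$$\left| \int_\alpha^\beta f(x, t)\,dt - \sum_{k=0}^{n-1} f(x, t_k)\,\Delta t_k \right| \le \omega_f(\delta)(\beta - \alpha),$$

which gives

$$\left| \int_a^b \left\{ \int_\alpha^\beta f(x, t)\,dt \right\} dx - \int_a^b \left\{ \sum_{k=0}^{n-1} f(x, t_k)\,\Delta t_k \right\} dx \right| \le \omega_f(\delta)(\beta - \alpha)(b - a) \tag{4}$$

after integration with respect to x. On the other hand, applying (3) to the function
$\Phi(t)$ with $\tau_k = t_k$, we get

$$\left| \int_\alpha^\beta \Phi(t)\,dt - \sum_{k=0}^{n-1} \Phi(t_k)\,\Delta t_k \right| = \left| \int_\alpha^\beta \left\{ \int_a^b f(x, t)\,dx \right\} dt - \sum_{k=0}^{n-1} \left\{ \int_a^b f(x, t_k)\,dx \right\} \Delta t_k \right| \le \omega_\Phi(\delta)(\beta - \alpha) \le \omega_f(\delta)(b - a)(\beta - \alpha), \tag{5}$$

by (1). Since obviously

$$\sum_{k=0}^{n-1} \left\{ \int_a^b f(x, t_k)\,dx \right\} \Delta t_k = \int_a^b \left\{ \sum_{k=0}^{n-1} f(x, t_k)\,\Delta t_k \right\} dx,$$

it follows from (4) and (5) that

$$\left| \int_a^b \left\{ \int_\alpha^\beta f(x, t)\,dt \right\} dx - \int_\alpha^\beta \left\{ \int_a^b f(x, t)\,dx \right\} dt \right| \le 2\omega_f(\delta)(b - a)(\beta - \alpha),$$

which implies (2), since the right-hand side can be made arbitrarily small. ∎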


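9.113. Example. If

$$f(x, t) = x^t \quad (0 \le x \le 1,\ 0 < \alpha \le t \le \beta),$$

then (2) becomes

$$\int_0^1 \left\{ \int_\alpha^\beta x^t\,dt \right\} dx = \int_\alpha^\beta \left\{ \int_0^1 x^t\,dx \right\} dt.$$

Evaluating the “inner integrals,” we get

$$\int_0^1 \frac{x^\beta - x^\alpha}{\ln x}\,dx = \ln \frac{1 + \beta}{1 + \alpha}.$$

Any attempt to evaluate this integral directly by the methods of Secs. 9.4 and 9.5
must confront a formidable obstacle, namely the fact that the indefinite integral
of the function

$$\frac{x^\beta - x^\alpha}{\ln x}$$

cannot be expressed in terms of elementary functions.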
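Although no elementary antiderivative is available, the value obtained by interchanging the order of integration can be checked numerically. The sketch below assumes a midpoint rule (whose nodes never touch the endpoints x = 0 and x = 1, where the integrand needs a limiting value); the names `lhs` and `rhs` and the sample parameters are illustrative.

```python
import math

def lhs(alpha, beta, m=20000):
    """Midpoint rule for the integral over [0, 1] of (x^beta - x^alpha) / ln x."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        x = (k + 0.5) * h  # midpoints never hit x = 0 or x = 1
        total += (x ** beta - x ** alpha) / math.log(x)
    return total * h

def rhs(alpha, beta):
    """Closed form obtained from the repeated integral: ln((1 + beta) / (1 + alpha))."""
    return math.log((1 + beta) / (1 + alpha))

for alpha, beta in [(1.0, 2.0), (0.5, 3.0)]:
    print(alpha, beta, lhs(alpha, beta), rhs(alpha, beta))
```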


9.114. THEOREM. Let f(x, t) and its derivative with respect to t, namely the
function†

$$f_t(x, t) = \lim_{h \to 0} \frac{f(x, t + h) - f(x, t)}{h} \quad (x \text{ fixed}), \tag{6}$$

be continuous on the rectangle

$$a \le x \le b, \quad t_0 - \delta \le t \le t_0 + \delta \quad (\delta > 0).$$

Then the function

$$\Phi(t) = \int_a^b f(x, t)\,dx \quad (|t - t_0| < \delta)$$

is differentiable at $t = t_0$, with derivative

$$\Phi'(t_0) = \int_a^b f_t(x, t_0)\,dx. \tag{7}$$

Proof. Applying Theorem 9.112 to the function $f_t(x, t)$, we get

$$\int_{t_0}^t \left\{ \int_a^b f_t(x, \tau)\,dx \right\} d\tau = \int_a^b \left\{ \int_{t_0}^t f_t(x, \tau)\,d\tau \right\} dx = \int_a^b f(x, t)\,dx - \int_a^b f(x, t_0)\,dx \quad (|t - t_0| < \delta), \tag{8}$$

with the help of Theorem 9.33. The left-hand side of (8) has a derivative with
respect to t equal to the integrand, by Theorems 9.31b and 9.111, and hence the
right-hand side has the same derivative. But the second term in the right-hand
side is independent of t, so that its derivative with respect to t vanishes. Equating
derivatives of both sides of (8) at the point $t = t_0$, we get (7). ∎


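9.115. Example. Evaluate the integral

$$\Phi(t) = \int_0^{\pi/2} \ln (t^2 - \sin^2 x)\,dx \quad (t > 1).$$

Solution. It follows from (7) that

$$\Phi'(t) = \int_0^{\pi/2} \frac{2t\,dx}{t^2 - \sin^2 x} = \frac{2}{\sqrt{t^2 - 1}} \operatorname{arc\,tan} \left( \frac{\sqrt{t^2 - 1}}{t} \tan x \right) \bigg|_0^{\pi/2} = \frac{\pi}{\sqrt{t^2 - 1}} \tag{9}$$

(verify that the conditions of the theorem are satisfied). Integrating (9) with
respect to t, we get

$$\Phi(t) = \int \frac{\pi\,dt}{\sqrt{t^2 - 1}} = \pi \ln \left( t + \sqrt{t^2 - 1} \right) + C.$$

To find the value of the constant C, we take the limit as $t \to \infty$, or equivalently
as $\tau = 1/t \to 0$, in the formula

$$C = \int_0^{\pi/2} \ln (t^2 - \sin^2 x)\,dx - \pi \ln \left( t + \sqrt{t^2 - 1} \right) = \int_0^{\pi/2} \ln \left( 1 - \frac{\sin^2 x}{t^2} \right) dx - \pi \ln \frac{t + \sqrt{t^2 - 1}}{t}, \tag{10}$$

valid for all t > 1. First we note that since the function $\ln (1 - \tau^2 \sin^2 x)$ is
continuous on the rectangle $0 \le x \le \pi/2$, $0 \le \tau \le \tau_0 < 1$, it follows from Theorem
9.111 that the function

$$\Psi(\tau) = \int_0^{\pi/2} \ln (1 - \tau^2 \sin^2 x)\,dx$$

is continuous for $0 \le \tau \le \tau_0$, and hence that

$$0 = \Psi(0) = \lim_{\tau \to 0} \Psi(\tau) = \lim_{\tau \to 0} \int_0^{\pi/2} \ln (1 - \tau^2 \sin^2 x)\,dx,$$

so that the limit of the first term in the right-hand side of (10) as $\tau \to 0$ vanishes.
Therefore

$$C = -\pi \lim_{t \to \infty} \ln \frac{t + \sqrt{t^2 - 1}}{t} = -\pi \lim_{\tau \to 0} \ln \left( 1 + \sqrt{1 - \tau^2} \right) = -\pi \ln 2,$$

and hence finally

$$\Phi(t) = \pi \ln \left( t + \sqrt{t^2 - 1} \right) - \pi \ln 2 = \pi \ln \frac{t + \sqrt{t^2 - 1}}{2}.$$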
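The closed form $\Phi(t) = \pi \ln \frac{t + \sqrt{t^2 - 1}}{2}$ for t > 1 can be compared with a direct quadrature of the defining integral. This is a sketch assuming a midpoint rule; the names `phi_numeric` and `phi_closed` and the sample values of t are illustrative.

```python
import math

def phi_numeric(t, m=20000):
    """Midpoint rule for the integral over [0, pi/2] of ln(t^2 - sin^2 x), t > 1."""
    h = (math.pi / 2) / m
    return sum(math.log(t * t - math.sin((k + 0.5) * h) ** 2) for k in range(m)) * h

def phi_closed(t):
    """Closed form derived by differentiating with respect to the parameter t."""
    return math.pi * math.log((t + math.sqrt(t * t - 1)) / 2)

for t in (1.5, 2.0, 5.0):
    print(t, phi_numeric(t), phi_closed(t))
```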


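9.116. Next we consider a somewhat more complicated case, where the limits of
integration as well as the integrand depend on the parameter t:

THEOREM. Let f(x, t) and $f_t(x, t)$ be the same as in Theorem 9.114, and let $\varphi(t)$ be a
differentiable function on the interval $t_0 - \delta \le t \le t_0 + \delta$, taking its values in the
interval $a \le x \le b$. Then the function

$$\Phi(t) = \int_a^{\varphi(t)} f(x, t)\,dx \quad (|t - t_0| < \delta)$$

is differentiable at $t = t_0$, with derivative

$$\Phi'(t_0) = \int_a^{\varphi(t_0)} f_t(x, t_0)\,dx + f(\varphi(t_0), t_0)\,\varphi'(t_0). \tag{11}$$

Proof. Let

$$\Phi(t) = \Phi_1(t) + \Phi_2(t), \tag{12}$$

where

$$\Phi_1(t) = \int_a^{\varphi(t_0)} f(x, t)\,dx, \qquad \Phi_2(t) = \int_{\varphi(t_0)}^{\varphi(t)} f(x, t)\,dx.$$

By Theorem 9.114, $\Phi_1(t)$ is differentiable at $t = t_0$, with derivative

$$\Phi_1'(t_0) = \int_a^{\varphi(t_0)} f_t(x, t_0)\,dx. \tag{13}$$

The derivative of $\Phi_2(t)$ at $t = t_0$ can be calculated directly. In fact,

$$\frac{\Phi_2(t_0 + h) - \Phi_2(t_0)}{h} = \frac{1}{h} \int_{\varphi(t_0)}^{\varphi(t_0 + h)} f(x, t_0 + h)\,dx = \frac{\varphi(t_0 + h) - \varphi(t_0)}{h}\,f(\theta, t_0 + h) \quad (|h| < \delta),$$

by Theorem 9.15f, where $\theta$ is a number between $\varphi(t_0)$ and $\varphi(t_0 + h)$. Taking the
limit as $h \to 0$ and using the properties of the functions f(x, t) and $\varphi(t)$, we get

$$\Phi_2'(t_0) = \varphi'(t_0)\,f(\varphi(t_0), t_0). \tag{14}$$

Comparison of (12), (13), and (14) then gives (11). ∎

In particular, if $\varphi(t) = t$, then (11) reduces to

$$\left[ \frac{d}{dt} \int_a^t f(x, t)\,dx \right]_{t = t_0} = \int_a^{t_0} f_t(x, t_0)\,dx + f(t_0, t_0). \tag{11'}$$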


9.117. Example. Differentiate the function

$$\Phi_n(t) = \frac{1}{(n-1)!} \int_a^t (t - x)^{n-1} f(x)\,dx \quad (n = 1, 2, \ldots), \tag{15}$$

where f(x) is continuous on the interval $a \le x \le b$.

Solution. It follows from (11') that for $n \ge 2$

$$\Phi_n'(t) = \frac{1}{(n-1)!} \int_a^t (n-1)(t - x)^{n-2} f(x)\,dx + \frac{(t - x)^{n-1} f(x)}{(n-1)!} \bigg|_{x = t} = \Phi_{n-1}(t).$$

But obviously $\Phi_n(a) = 0$, and hence

$$\Phi_n(t) = \int_a^t \Phi_n'(x)\,dx = \int_a^t \Phi_{n-1}(x)\,dx.$$

Noting that

$$\Phi_1(t) = \int_a^t f(x)\,dx,$$

we see that $\Phi_n(t)$ is the result of n consecutive integrations of the function f(x)
from a to t, i.e., the operation leading to $\Phi_n(t)$ is the inverse of n-fold
differentiation. Formula (15) represents this operation in the form of a single
integral involving a parameter.
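The equivalence between the single integral (15) and n consecutive integrations can be tried out numerically. The sketch below assumes a midpoint rule and the test function f(x) = cos x with a = 0, for which the repeated integrals are sin t, 1 − cos t, and t − sin t; the helper names are illustrative.

```python
import math

def midpoint(g, a, b, m=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / m
    return sum(g(a + (k + 0.5) * h) for k in range(m)) * h

def phi_n(f, n, a, t, m=2000):
    """Formula (15): a single integral with the kernel (t - x)^(n-1) / (n-1)!."""
    return midpoint(lambda x: (t - x) ** (n - 1) * f(x), a, t, m) / math.factorial(n - 1)

def iterated(f, n, a, t, m=200):
    """n consecutive integrations of f from a to t (cost grows like m**n, keep n small)."""
    if n == 0:
        return f(t)
    return midpoint(lambda x: iterated(f, n - 1, a, x, m), a, t, m)

t = 1.0
print(phi_n(math.cos, 1, 0.0, t), math.sin(t))
print(phi_n(math.cos, 2, 0.0, t), 1 - math.cos(t), iterated(math.cos, 2, 0.0, t))
print(phi_n(math.cos, 3, 0.0, t), t - math.sin(t))
```

The recursive `iterated` helper is deliberately naive; it is there only to show that the nested integrals agree with the one-dimensional formula.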


9.12. Line Integrals 


9.121. Definitions and basic properties. As in Sec. 9.71, let L be a curve in
n-dimensional space with parametric equations

$$x_1 = x_1(t),\ x_2 = x_2(t),\ \ldots,\ x_n = x_n(t) \quad (a \le t \le b), \tag{1}$$

where the functions $x_j(t)$ are continuous with continuous derivatives $x_j'(t)$
satisfying the condition

$$\sum_{j=1}^n [x_j'(t)]^2 > 0 \quad (a \le t \le b), \tag{2}$$

so that L is smooth and has no singular points (Sec. 9.74). We will think of L as
equipped with the “direction of increasing t,” namely, the direction of motion of
the variable point P(t) with coordinates (1) as t varies from a to b. Such a curve
with an assigned direction will simply be called a (smooth) “path.”† As in Sec.
9.74, let s(t) be the length of the arc of L joining the initial point of L (with
parameter value a) to the variable point P(t). Then

$$s'(t) = \sqrt{\sum_{j=1}^n [x_j'(t)]^2} > 0,$$

and we can choose s as our parameter instead of the original parameter t. The
equations (1) then become

$$x_1 = x_1(s),\ x_2 = x_2(s),\ \ldots,\ x_n = x_n(s) \quad (0 \le s \le l), \tag{1'}$$

where $x_j(s) \equiv x_j(t(s))$ and l is just the length of L. By the same token, the
variable point with coordinates (1') is denoted by P(s). The function s = s(t) has a
continuously differentiable inverse t = t(s), with derivative

$$t'(s) = \frac{1}{s'(t)}.$$

Now let

$$\Pi_s = \{0 = s_0 \le s_1 \le \cdots \le s_p = l\}$$

be a partition of the interval $0 \le s \le l$, and let

$$\Pi_t = \{a = t_0 \le t_1 \le \cdots \le t_p = b\}$$

be the corresponding partition of the interval $a \le t \le b$, where

$$s_j = s(t_j), \quad t_j = t(s_j).$$

As usual, let

$$d(\Pi_s) = \max \{\Delta s_0, \Delta s_1, \ldots, \Delta s_{p-1}\},$$

$$d(\Pi_t) = \max \{\Delta t_0, \Delta t_1, \ldots, \Delta t_{p-1}\},$$

where

$$\Delta s_j = s_{j+1} - s_j, \quad \Delta t_j = t_{j+1} - t_j.$$

Moreover, let f(P) and g(P) be two real functions defined on L, i.e., at every
point $P \in L$, and form the (Riemann-Stieltjes) sum

$$\sum_{j=0}^{p-1} f(P_j^*)\,[g(P_{j+1}) - g(P_j)], \tag{3}$$

where

$$P_j = P(s_j), \quad P_j^* = P(s_j^*) \quad (s_j \le s_j^* \le s_{j+1})$$

(so that $P_j^*$ is an arbitrary point of the arc $P_j P_{j+1}$). Then the limit of the sum (3)
as $d(\Pi_s) \to 0$, provided it exists, is called the (Stieltjes) line integral of the
function f(P) with respect to the function g(P) along the path AB and is denoted
by

$$\int_{AB} f(P)\,dg(P). \tag{4}$$

Alternatively, we can write

$$f(t) = f(P(t)), \quad g(t) = g(P(t))$$

and form the sum

$$\sum_{j=0}^{p-1} f(t_j^*)\,[g(t_{j+1}) - g(t_j)], \tag{3'}$$

where $t_j \le t_j^* \le t_{j+1}$. But $d(\Pi_s) \to 0$ implies $d(\Pi_t) \to 0$ and conversely, since

$$\Delta t_j = t'(\theta_j)\,\Delta s_j \quad (s_j \le \theta_j \le s_{j+1}),$$

$$\Delta s_j = s'(\tau_j)\,\Delta t_j \quad (t_j \le \tau_j \le t_{j+1}).$$

Therefore if either of the sums (3) or (3') approaches a limit under arbitrary
refinement of the partition, then so does the other, and investigation of the line
integral (4) is equivalent to investigation of the limit of (3') as $d(\Pi_t) \to 0$, i.e., of
the ordinary Stieltjes integral

$$\int_a^b f(t)\,dg(t)$$

(Sec. 9.55a). This integral exists and equals

$$\int_a^b f(t)\,g'(t)\,dt \tag{5}$$

if f(t) is continuous and g(t) has a continuous derivative (Theorem 9.55b).


Next we establish a few properties of line integrals. Let AB be a path with
initial point A and final point B. Then obviously

$$\int_{AB} dg(P) = g(B) - g(A)$$

for every function g(P), while

$$\int_{AB} f(P)\,dg(P) = 0$$

for every function f(P) if g(P) is constant along AB. Moreover, it is easy to see
that if either of the integrals (4) or

$$\int_{BA} f(P)\,dg(P)$$

exists, then so does the other and

$$\int_{AB} f(P)\,dg(P) = -\int_{BA} f(P)\,dg(P).$$

Furthermore, we have the following analogues of the formulas of Sec. 9.55c
(provided the integrals on the right exist):

$$\int_{AB} [\alpha_1 f_1(P) + \alpha_2 f_2(P)]\,dg(P) = \alpha_1 \int_{AB} f_1(P)\,dg(P) + \alpha_2 \int_{AB} f_2(P)\,dg(P), \tag{6}$$

$$\int_{AB} f(P)\,d[\beta_1 g_1(P) + \beta_2 g_2(P)] = \beta_1 \int_{AB} f(P)\,dg_1(P) + \beta_2 \int_{AB} f(P)\,dg_2(P), \tag{7}$$

$$\int_{AB} f(P)\,dg(P) = \int_{AC} f(P)\,dg(P) + \int_{CB} f(P)\,dg(P) \quad (C \in AB), \tag{8}$$

$$\int_{AB} f(P)\,dg(P) = f(P)g(P) \Big|_A^B - \int_{AB} g(P)\,df(P)$$

($\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$ real).

A sum

$$\int_{AB} f_1(P)\,dg_1(P) + \cdots + \int_{AB} f_m(P)\,dg_m(P)$$

of integrals over the same path AB is written more concisely as

$$\int_{AB} f_1(P)\,dg_1(P) + \cdots + f_m(P)\,dg_m(P) \tag{9}$$

or even just as

$$\int_{AB} f_1\,dg_1 + \cdots + f_m\,dg_m, \tag{9'}$$

where we retain only one integral sign. If the functions $f_j(t)$ are all continuous
and the functions $g_j(t)$ all have continuous derivatives, the integral (9) exists and
equals

$$\int_a^b [f_1(t)\,g_1'(t) + \cdots + f_m(t)\,g_m'(t)]\,dt.$$

The above theory is easily extended to the case where the functions $x_j(t)$
figuring in (1) are continuous but their derivatives $x_j'(t)$ are only piecewise
continuous and satisfy the conditions

$$\sum_{j=1}^n [x_j'(t + 0)]^2 > 0, \quad \sum_{j=1}^n [x_j'(t - 0)]^2 > 0 \quad (a \le t \le b) \tag{2'}$$

instead of (2), so that L is piecewise smooth instead of smooth. In fact, L is then
the union of a finite number of smooth arcs

$$L_1 = AC_1,\ L_2 = C_1 C_2,\ \ldots,\ L_m = C_{m-1} B,$$

and we need only write

$$\int_L f(P)\,dg(P) = \int_{L_1} f(P)\,dg(P) + \int_{L_2} f(P)\,dg(P) + \cdots + \int_{L_m} f(P)\,dg(P),$$

by definition. This is consistent with the definition of the left-hand side as the
limit of the sum (3) as $d(\Pi_s) \to 0$ in the case where s(t) is only piecewise smooth
instead of smooth.

Figure 61


9.122. Example. Evaluate the line integrals

$$I_1 = \int_{AB} (x^2 - y^2)\,dx, \qquad I_2 = \int_{AB} (x^2 - y^2)\,dy,$$

where AB is the arc of the parabola $y = x^2$ joining the points A = (−1, 1) and B =
(1, 1), as shown in Figure 61.

Solution. In the first case, choosing x as the parameter, we get

$$I_1 = \int_{-1}^1 (x^2 - x^4)\,dx = \left( \frac{x^3}{3} - \frac{x^5}{5} \right) \bigg|_{-1}^1 = 2 \left( \frac{1}{3} - \frac{1}{5} \right) = \frac{4}{15}.$$

In the second case, we cannot choose y as the parameter, since two points of the
arc correspond to each value of y (except y = 0). Therefore we again choose x as
the parameter, obtaining

$$I_2 = \int_{-1}^1 (x^2 - x^4)\,2x\,dx = \left( \frac{2x^4}{4} - \frac{2x^6}{6} \right) \bigg|_{-1}^1 = 0,$$

with the help of (5).
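Formula (5) reduces a line integral to an ordinary integral over the parameter, which is easy to mirror in code. The sketch below assumes a midpoint rule; the function `line_integral` and its argument names are illustrative. It recomputes $I_1$ and $I_2$ for the parabola $y = x^2$ parametrized by t = x.

```python
def line_integral(F, path, dcoord, a, b, m=2000):
    """Approximate the integral of F(P(t)) * g'(t) dt over [a, b], as in formula (5).

    F      -- integrand as a function of the point coordinates (x, y)
    path   -- t -> (x(t), y(t))
    dcoord -- t -> g'(t), the derivative of the coordinate being integrated against
    """
    h = (b - a) / m
    total = 0.0
    for k in range(m):
        t = a + (k + 0.5) * h
        x, y = path(t)
        total += F(x, y) * dcoord(t)
    return total * h

F = lambda x, y: x * x - y * y
parabola = lambda t: (t, t * t)

I1 = line_integral(F, parabola, lambda t: 1.0, -1.0, 1.0)      # g = x, so g'(t) = 1
I2 = line_integral(F, parabola, lambda t: 2.0 * t, -1.0, 1.0)  # g = y, so g'(t) = 2t

print(I1, I2)
```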


9.123. Line integrals along a closed contour. By a closed path we mean a path
AB whose end points A and B coincide. A closed path is often called a (closed)
contour, and the line integral along a contour L is often denoted by

$$\oint_L f\,dg. \tag{10}$$

The quantity (10) does not depend on the choice of the initial point A, since if A'
is any other point, then, by (8),

$$\int_{AA'A} f\,dg = \int_{AA'} f\,dg + \int_{A'A} f\,dg = \int_{A'A} f\,dg + \int_{AA'} f\,dg = \int_{A'AA'} f\,dg.$$

However, (10) does depend on the direction in which the contour is traversed,
since changing this direction changes the sign of all the differences $g(P_{j+1}) - g(P_j)$
and hence the sign of the integral (10) as well. Thus, using arrowheads to
distinguish directions along the contour L, we have

$$\oint_{\circlearrowleft L} f\,dg = -\oint_{\circlearrowright L} f\,dg.$$

Note that this notation is useful only in the plane (n = 2).


9.124. Area in terms of line integrals. Consider the plane figure shown in
Figure 62, bounded on top by a curve $L_1$ with equation $y = \varphi_1(x)$ and on the
bottom by a curve $L_2$ with equation $y = \varphi_2(x)$. The area of this figure is just

$$S = \int_a^b \varphi_1(x)\,dx - \int_a^b \varphi_2(x)\,dx. \tag{11}$$

By the very definition of a line integral, we can write (11) as a difference of line
integrals

$$S = \int_{L_1} y\,dx - \int_{L_2} y\,dx,$$

Figure 62

Figure 63

or equivalently, in the form

$$S = \int_{L_1} y\,dx + \int_{-L_2} y\,dx,$$

where $-L_2$ denotes the path $L_2$ traversed from B to A instead of from A to B,
afterwards adding two more (vanishing) integrals along the vertical line
segments (if any) forming the sides of the figure.† Combining all these integrals,
we can express the area of the figure as the following line integral along the
closed contour L forming the boundary of the figure:

$$S = \oint_{\circlearrowright L} y\,dx = -\oint_{\circlearrowleft L} y\,dx. \tag{12}$$

To derive a more symmetric formula for S, we interchange the roles of x and y
(see Figure 63). This gives

$$S = \int_c^d \psi_1(y)\,dy - \int_c^d \psi_2(y)\,dy = \int_{L_3} x\,dy - \int_{L_4} x\,dy = \int_{L_3} x\,dy + \int_{-L_4} x\,dy = \oint_{\circlearrowleft L} x\,dy. \tag{12'}$$

Adding (12) and (12') and multiplying by $\frac{1}{2}$, we get

$$S = \frac{1}{2} \oint_{\circlearrowleft L} x\,dy - y\,dx. \tag{13}$$

Formula (13) often leads to particularly simple calculations. For example, it
follows from (13) that the area enclosed by the ellipse

$$x = a \cos t, \quad y = b \sin t \quad (0 \le t \le 2\pi)$$

is just

$$S = \frac{1}{2} \oint_{\circlearrowleft L} x\,dy - y\,dx = \frac{1}{2} \int_0^{2\pi} ab (\cos^2 t + \sin^2 t)\,dt = \pi ab,$$

in keeping with Example 9.62e. By contrast with (13), it is interesting to note
that the formula

$$\int_{AB} x\,dy + y\,dx = \int_{AB} d(xy) = xy \Big|_A^B = x(B)y(B) - x(A)y(A)$$

implies

$$\oint_L x\,dy + y\,dx = 0$$

for any closed contour L.†
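Formula (13) translates directly into a numerical recipe for areas. The sketch below assumes a counterclockwise parametrization and a midpoint rule; the helper names are illustrative. It also includes the special case of a closed polygonal line, where each edge from $(x_0, y_0)$ to $(x_1, y_1)$ contributes $(x_0 y_1 - x_1 y_0)/2$ exactly.

```python
import math

def area_contour(x, y, dx, dy, t0, t1, m=2000):
    """Formula (13): S = (1/2) * contour integral of x dy - y dx (counterclockwise)."""
    h = (t1 - t0) / m
    total = 0.0
    for k in range(m):
        t = t0 + (k + 0.5) * h
        total += x(t) * dy(t) - y(t) * dx(t)
    return 0.5 * total * h

def area_polygon(vertices):
    """Formula (13) on a closed polygonal line given by its vertices in order."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        s += x0 * y1 - x1 * y0
    return 0.5 * s

a, b = 3.0, 2.0
ellipse = area_contour(lambda t: a * math.cos(t), lambda t: b * math.sin(t),
                       lambda t: -a * math.sin(t), lambda t: b * math.cos(t),
                       0.0, 2 * math.pi)
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

print(ellipse, math.pi * a * b)
print(area_polygon(square), area_polygon(square[::-1]))  # sign flips with direction
```

Reversing the direction of traversal changes the sign of the result, in line with the remark in Sec. 9.123.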

9.125. We now prove two theorems about line integrals of the special type

$$\int_L f_1(P)\,dx_1 + \cdots + f_n(P)\,dx_n. \tag{14}$$

a. THEOREM. Let L be a path in n-dimensional space, with parametric equations
(1), where the functions $x_j(t)$ are continuous with continuous derivatives $x_j'(t)$
satisfying the condition (2), and suppose the integral (14) exists. Then

$$\left| \int_L f_1(P)\,dx_1 + \cdots + f_n(P)\,dx_n \right| \le l \sup_{P \in L} \sqrt{\sum_{j=1}^n f_j^2(P)}, \tag{15}$$

where l is the length of L.

Proof. The expression

$$\sum_{k=0}^{p-1} \sum_{j=1}^n f_j(P_k)\,[x_j(t_{k+1}) - x_j(t_k)],$$

where $P_k = P(t_k)$, is a Riemann-Stieltjes sum for the integral in the left-hand side
of (15) corresponding to a partition

$$\Pi = \{a = t_0 \le t_1 \le \cdots \le t_p = b\}.$$

Applying Cauchy’s inequality (Sec. 3.14b), we get

$$\left| \sum_{k=0}^{p-1} \sum_{j=1}^n f_j(P_k)\,[x_j(t_{k+1}) - x_j(t_k)] \right| \le \sum_{k=0}^{p-1} \sqrt{\sum_{j=1}^n f_j^2(P_k)} \sqrt{\sum_{j=1}^n [x_j(t_{k+1}) - x_j(t_k)]^2} \le \sup_{P \in L} \sqrt{\sum_{j=1}^n f_j^2(P)}\; s(L_\Pi), \tag{16}$$

where $s(L_\Pi)$ is the length of the polygonal line $L_\Pi = P_0 P_1 \ldots P_p$ inscribed in L.
But $s(L_\Pi) \to l$ as $d(\Pi) \to 0$, where l is the length of L. Hence, passing to the
limit as $d(\Pi) \to 0$ in (16), we get (15). ∎

b. THEOREM. Let L be the same as in the preceding theorem, and suppose the
functions $f_1(P), \ldots, f_n(P)$ are continuous on L itself and on some neighborhood of
L containing all polygonal lines $L_\Pi$ with sufficiently small segments inscribed in
L. Then

$$\lim_{d(\Pi) \to 0} \int_{L_\Pi} f_1(P)\,dx_1 + \cdots + f_n(P)\,dx_n = \int_L f_1(P)\,dx_1 + \cdots + f_n(P)\,dx_n, \tag{17}$$

so that the line integral along L can be deduced from a knowledge of the line
integrals along all polygonal lines $L_\Pi$ (with sufficiently small segments)
inscribed in L.

Proof. It is enough to prove (17) for one term, i.e., to show that

$$\lim_{d(\Pi) \to 0} \int_{L_\Pi} f_j(P)\,dx_j = \int_L f_j(P)\,dx_j,$$

or simply

$$\lim_{d(\Pi) \to 0} \int_{L_\Pi} f(P)\,dx = \int_L f(P)\,dx, \tag{18}$$

where for brevity we drop the index j. Given any $\varepsilon > 0$, choose $\delta > 0$ such that
$d(\Pi) < \delta$ implies

$$\left| \int_L f(P)\,dx - \sum_{k=0}^{p-1} f(P_k)\,\Delta x_k \right| < \varepsilon, \tag{19}$$

while $\rho(P, P') < \delta$ implies $|f(P) - f(P')| < \varepsilon$ (justify the latter implication in
detail).† Let $P_k P_{k+1}$ be any segment of $L_\Pi$. Then, by (15), $d(\Pi) < \delta$
implies

$$\left| \int_{P_k P_{k+1}} f(P)\,dx - f(P_k)\,\Delta x_k \right| = \left| \int_{P_k P_{k+1}} f(P)\,dx - \int_{P_k P_{k+1}} f(P_k)\,dx \right| = \left| \int_{P_k P_{k+1}} [f(P) - f(P_k)]\,dx \right| \le \varepsilon l_k, \tag{20}$$

where $l_k$ is the length of the segment $P_k P_{k+1}$. Summing over all the segments of
$L_\Pi$, we get

$$\left| \int_{L_\Pi} f(P)\,dx - \sum_{k=0}^{p-1} f(P_k)\,\Delta x_k \right| \le \varepsilon s(L_\Pi).$$

But $s(L_\Pi) \le l$, where l is the length of L (see Sec. 9.73). Hence (19) and (20)
together imply first (18) and then (17). ∎


Both of the above theorems can be extended to the case of a piecewise smooth 
path L by the usual device of dividing L into a finite number of smooth arcs. 


Problems 


1. Prove that if f(x) is periodic with period T on $(-\infty, \infty)$, then

$$F(x) = \int_{x_0}^x f(\xi)\,d\xi$$

is the sum of a periodic function and a linear function.

2. Prove that if f(x) and $g(x) \ge 0$ are integrable on [a, b], then there exists a point
$\xi \in [a, b]$ such that

$$\int_a^b f(x) g(x)\,dx = g(a + 0) \int_a^\xi f(x)\,dx$$

if g(x) is nonincreasing on [a, b] and a point $\eta \in [a, b]$ such that

$$\int_a^b f(x) g(x)\,dx = g(b - 0) \int_\eta^b f(x)\,dx$$

if g(x) is nondecreasing on [a, b].

3. Prove that if f(x) and g(x) are integrable on [a, b] and if g(x) is monotonic on
[a, b], then there exists a point $\xi \in [a, b]$ such that

$$\int_a^b f(x) g(x)\,dx = g(a + 0) \int_a^\xi f(x)\,dx + g(b - 0) \int_\xi^b f(x)\,dx.$$

Comment. This result is often called the second mean value theorem for 
integrals, as opposed to the (first) mean value theorem for integrals (Theorem 


9.134). 
4. Prove the following formula for repeated integration by parts (valid for
“sufficiently smooth” u and v):

$$\int_a^b u v^{(n+1)}\,dx = \left[ u v^{(n)} - u' v^{(n-1)} + \cdots + (-1)^n u^{(n)} v \right]_a^b + (-1)^{n+1} \int_a^b u^{(n+1)} v\,dx.$$

5. (Riemann’s criterion). Let $\Omega_f(\alpha, \beta)$ be the oscillation of the function f(x) on
the interval $[\alpha, \beta]$, i.e., the quantity

$$\Omega_f(\alpha, \beta) = \sup_{x', x'' \in [\alpha, \beta]} |f(x') - f(x'')|.$$

Prove that a (bounded) function f(x) is integrable on [a, b] if and only if

$$\lim_{d(\Pi) \to 0} \sum_{k=0}^{n-1} \Omega_f(x_k, x_{k+1})\,\Delta x_k = 0,$$

where $\Pi$, $d(\Pi)$, and $\Delta x_k$ have the usual meaning (Sec. 9.11).


6. Prove that every bounded monotonic function is integrable. 
7. Prove that if f(x) is integrable, then so is |f(x)|.

8. (Du Bois-Reymond’s criterion). Let $\Omega_f(c)$ be the oscillation of the function f(x)
at the point c, i.e., the quantity

$$\Omega_f(c) = \inf_{\alpha < c < \beta} \Omega_f(\alpha, \beta),$$

where $\Omega_f(\alpha, \beta)$ is the same as in Problem 5. Prove that a (bounded) function is
integrable on [a, b] if and only if given any $\varepsilon > 0$ and $\delta > 0$, the set of all points
$c \in [a, b]$ at which $\Omega_f(c) \ge \varepsilon$ can be covered (Sec. 3.93c) by a finite number of
intervals the sum of whose lengths does not exceed $\delta$.

9. (Lebesgue’s criterion). Prove that a (bounded) function f(x) is integrable on
[a, b] if and only if, given any $\delta > 0$, the set of all discontinuity points of f(x) can
be covered by a finite or countable number of intervals the sum of whose lengths
does not exceed $\delta$.

10. Prove that if f(x) is integrable, then so is the function 1/f(x) provided it is 
bounded. 

11. By integrating the inequality

$$\sin^{2n+1} x < \sin^{2n} x < \sin^{2n-1} x$$

from 0 to $\pi/2$ and then using formulas (1) and (2), p. 309, prove Wallis’ product

$$\frac{\pi}{2} = \lim_{n \to \infty} \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdots 2n \cdot 2n}{1 \cdot 3 \cdot 3 \cdot 5 \cdots (2n-1)(2n+1)} = \prod_{n=1}^\infty \frac{1}{1 - (1/4n^2)}.$$

12. Let

$$\Pi = \{a = x_0, x_1, \ldots, x_n = b\}$$

be any set of points of an interval $[A, B] \supset [a, b]$, arranged in any order, and let

$$d(\Pi) = \max \{|x_1 - x_0|, |x_2 - x_1|, \ldots, |x_n - x_{n-1}|\}.$$

Consider the “generalized Riemann sum”

$$S_\Pi(f) = \sum_{k=0}^{n-1} f(\xi_k)(x_{k+1} - x_k),$$

where f(x) is defined on [A, B] and $\xi_k$ is any point between $x_k$ and $x_{k+1}$ (including
the points $x_k$, $x_{k+1}$ themselves). Prove that

$$\lim_{d(\Pi) \to 0} S_\Pi(f) = \int_a^b f(x)\,dx$$

for every function f(x) continuous on [A, B], provided that

$$\sum_{k=0}^{n-1} |x_{k+1} - x_k| \le C \tag{1}$$

for all n = 1, 2, ...

13. Prove that the conclusion of the preceding problem breaks down if the 
condition (1) is dropped or if f(x) is allowed to be piecewise continuous. 

14. (Axiomatic definition of the integral). Suppose that with every piecewise
continuous function f(x) on the interval [a, b] and with every subinterval $[\alpha, \beta] \subset$
[a, b] there is associated a number $I_\alpha^\beta(f)$ satisfying the following conditions:

(a) $I_\alpha^\beta(kf) = k I_\alpha^\beta(f)$ for all real k;

(b) $I_\alpha^\beta(f) + I_\alpha^\beta(g) = I_\alpha^\beta(f + g)$ for all piecewise continuous f(x) and g(x);

(c) $I_\alpha^\beta(1) = \beta - \alpha$;

(d) $I_\alpha^\beta(f) = I_\alpha^\gamma(f) + I_\gamma^\beta(f)$ for all $\gamma \in (\alpha, \beta)$;

(e) $|I_\alpha^\beta(f)| \le C \sup_{\alpha \le x \le \beta} |f(x)|$ for some fixed constant C.

Prove that

$$I_\alpha^\beta(f) = \int_\alpha^\beta f(x)\,dx \tag{2}$$

for all piecewise continuous f(x).

15. The series (7), p. 353 converges “too slowly” to be useful in calculating the
number $\pi$. A more practical way of calculating $\pi$ is to use the “rapidly
converging” series

$$\frac{\pi}{6} = \frac{1}{\sqrt{3}} \left( 1 - \frac{1}{3 \cdot 3} + \frac{1}{3^2 \cdot 5} - \frac{1}{3^3 \cdot 7} + \cdots \right). \tag{3}$$

Prove (3).


+ See esp. Secs. 9.38 and 9.53. 
+ See also Theorem 5.17b, which shows that f(x) is uniformly continuous on [a, b]. 
t Note, however, that there are actually no new points of subdivision in [x;, x7 4+ 1] if mz = 1. 


+ Here, of course, we have in mind the direction 7 or d(II) — 0 discussed on p. 275. 


+ The symbol J is called the integral sign, and the function appearing behind the integral sign (i.e., the 
function “being integrated”) is called the integrand. The process leading from a function to its integral is, of 
course, called integration. 


+ Actually, the integrability of f(x) implies that of |fx)| (see Problem 7). 

+ The values of f(x) at the point x; themselves can be arbitrary. 

+ It can be shown that every bounded function with only finitely many points of discontinuity is integrable 
(see Problem 9). 


† See, e.g., G. E. Shilov and B. L. Gurevich, Integral, Measure and Derivative: A Unified Approach
(translated by R. A. Silverman), Dover Publications, Inc., N.Y. (1977).


+ The case of the area enclosed by a circle is no exception, since even in elementary geometry this area is 
calculated by nonelementary means (with the help of limits). 


† Given a finite number of line segments $\sigma_1, \sigma_2, \ldots, \sigma_r$, suppose that

(1) The final point $Q_j$ of $\sigma_j$ coincides with the initial point $P_{j+1}$ of $\sigma_{j+1}$ (j = 1, 2, ..., r), where $\sigma_{r+1} = \sigma_1$ by
definition;

(2) Two segments $\sigma_j$ and $\sigma_k$ have points in common (if and) only if k = j + 1, in which case $Q_j = P_{j+1}$ is
the only shared point.

Then the segments $\sigma_1, \sigma_2, \ldots, \sigma_r$ are said to form a simple closed polygonal line.


+ Because of the inadequacy of the high school definition of a limit, the reader should examine the 
definition of the length of a circular arc in the light of the more general theory of arc length given here. A 
reexamination of the elementary definition of the area enclosed by a circle from the standpoint of the 
considerations leading to formula (2) is also in order. 

† See Secs. 9.15 and 9.18. If h < 0, we set $[c, c + h] = \{x: c + h \le x \le c\}$ by definition.


+ In other words, the “invariance property” of differentials (Sec. 7.34) continues to hold for differentials 


appearing behind the indefinite integral sign. The importance of writing du in the expression J f(u) du is now 
apparent (see also Sec. 9.54). 


† Since P(x) has real coefficients, if $z = \alpha + i\beta$ is a root of P(x), then the complex conjugate $\bar{z} = \alpha - i\beta$ is
also a root, with the same multiplicity (Sec. 5.88).


+ If [a, b] contains xz, the integrand fails to be piecewise continuous, and the underlying theory of Secs. 
9.32—9.34 is no longer applicable. 


+ See e.g., G. M. Fichtenholz, The Indefinite Integral (translated by R. A. Silverman), Gordon and Breach, 
N. Y. (1971), Chapter 2. 


+ Here, of course, R is a different rational function from that appearing in (9), but one easily found from the 
latter. 


† In (12) we choose the increasing branch of arc cos u (Sec. 5.67), corresponding to choosing $[-\pi, 0]$ as the
domain of $\cos \theta$, so that $\sqrt{1 - u^2} = -\sin \theta$ (as always, the radical denotes the positive square root). Using
the decreasing branch of arc cos u corresponds to choosing $[0, \pi]$ as the domain of $\cos \theta$, so that $\sqrt{1 - u^2} = \sin \theta$.
We then get

$$\int \frac{du}{\sqrt{1 - u^2}} = -\operatorname{arc\,cos} u + C,$$

instead of (12), with the connection between the two results now being given by

$$\operatorname{arc\,cos} u = \frac{\pi}{2} - \operatorname{arc\,sin} u$$

(Sec. 5.67) instead of (13).
+ For simplicity, we revert to the usual rectangular coordinates x and y. 
t This is the Abel-Liouville theorem, proved, say, by N. G. Chebotarev, Uspekhi Mat. Nauk, vol. 2, no. 2 
(1947), pp. 3-20. By an elementary function we mean any function formed from polynomials, exponentials, 
logarithms, trigonometric functions, and inverse trigonometric functions by a finite number of algebraic 
operations (addition, subtraction, multiplication, and division) and a finite number of compositions (1.e., 
formation of composite functions). 


† $F(k, \varphi)$ and $E(k, \varphi)$ are known as Legendre’s incomplete elliptic integrals of the first and second kind, respectively.


+ The proof is given in Fichtenholz, The Indefinite Integral, Appendix. 
+ Fichtenholz, The Indefinite Integral, Sec. 5, Problems 3 and 5. 


t For a proof of this assertion, see G. H. Hardy, The Integration of Functions of a Single Variable, second 
edition, Cambridge University Press, London (1928). 


+ Here, and in the subsequent theorem, we again see the importance of writing du in the integral on the left 
(recall the discussion on p. 275). In fact, the formula du = u'(f) dt for the differential of u automatically 
leads to the correct integral on the right (cf. Theorem 9.38). 

‡ Obviously, $[A, B] \supset [a, b] = [u(\alpha), u(\beta)]$. As an exercise, verify that the set $\{u : u = u(t),\ \alpha \le t \le \beta\}$ is actually a closed interval, as asserted.

† In fact, if a > b, then

$$\int_a^b f(x)\, dx = -\int_b^a f(x)\, dx = -[F(a) - F(b)] = F(b) - F(a)$$

(see Sec. 9.18).


+ The existence of (12) follows from Theorem 9.14e and the assumed continuity of f(x) and g’(x). As an 
exercise, the reader should examine what happens if f(x) or g'(x) or both are only piecewise continuous. 


‡ The details are left as an exercise. For more on the Stieltjes integral, see e.g., Shilov and Gurevich, Integral, Measure and Derivative: A Unified Approach, Part 2.


+ See the discussion pertaining to Figure 82, p. 487, as well as the remarks on p. 486, concerning the case 
where ® is a union of more general figures with area. 


t More exactly, the area of the curvilinear trapezoid bounded on the bottom by the curve y = f(x), on the top 
by the segment [a, b] of the x-axis, and on the sides by segments of the vertical lines x = a and x = b. 


† A function f(x) is said to be odd if f(−x) = −f(x) and even if f(−x) = f(x). Clearly f(0) = 0 and

$$\int_{-a}^{a} f(x)\, dx = 0$$

if f(x) is odd, while

$$\int_{-a}^{a} f(x)\, dx = 2 \int_0^a f(x)\, dx$$

if f(x) is even (why?).

+ As an exercise, the reader should contrive a purely analytic proof of (8). 

† $P_k$ is the point of the curve $r = f(\theta)$ with polar coordinates $r_k = f(\theta_k)$ and $\theta_k$.

t See the discussion following formula (2), p. 289. 

† As in Sec. 9.39, this means that $x_j(t)$ is continuous on [a, b], with a piecewise continuous derivative $x_j'(t)$ on [a, b]. If the functions $x_j(t)$ are all piecewise smooth, the curve L itself is said to be piecewise smooth.

† We interpret $x_j'(a)$ as $x_j'(a + 0)$ if $x_j'(t)$ is discontinuous at t = a.

‡ At discontinuity points we replace this condition by the requirement that the quantities $x_j'(t + 0)$ do not vanish simultaneously, and similarly for the $x_j'(t - 0)$.


+ Thus / is the length of the generator AB and r the radius of the directrix I. 

+ The first inequality is obvious, and the second follows from the triangle inequality applied to the vectors ( 
oa, 0) and (0, 9/6) in the plane. 

+ As in the transition from (1') to (2'). 

+ Also see Appendix B. 

t It is also assumed that V and the part of V between every pair of horizontal planes have volume, and that 
the boundary of every cross section Vy is made up of one or more curves with length. 

+ See Secs. 2.82, 3.16, 5.17a, and 5.18. 


+ A derivative like (6), involving a function of n variables with n — 1 variables held fixed, is called a partial 
derivative. Here, of course, we have the simplest case, where n = 2. 


† To evaluate the indefinite integral in (9), make the substitutions

$$u = \tan x, \qquad \sin^2 x = \frac{u^2}{1 + u^2}, \qquad dx = \frac{du}{1 + u^2}$$

as in Sec. 9.43b. Note that (9) entails making a slight generalization of Theorem 9.33 (which?).


+ It will also be tacitly assumed that a path has no singular points, i.e., that the condition (2) holds (or the 
analogous condition (2') below). 


‡ There is a slight abuse of notation here, since the function $x_j(s)$ is not the same as the function $x_j(t)$, with t changed to s, but the context precludes any confusion. The same observation applies to writing f(t) = f(P(t)), g(t) = g(P(t)) below.

+ In Figure 62 these segments reduce to the points A and B. 

† Note that

$$\oint_L f(x)\, dx = \oint_L g(y)\, dy = 0$$

for arbitrary continuous functions f(x), g(y) and any closed contour L (why?).
+ By p(P, P’) we mean the length of the segment PP’ with the usual Euclidean metric (Sec. 3.14a). 


10 Analytic Functions


10.1. Basic Concepts


10.11. Differentiation in the complex domain 


a. Let $C_z$ be the “plane of the complex variable z = x + iy,” i.e., the set of all complex numbers of the form z = x + iy where x and y are arbitrary real numbers, and let $C_w$ be the “plane of the complex variable w = u + iv,” i.e., the set of all complex numbers of the form w = u + iv where u and v are arbitrary real numbers.† Then a function w = f(z) defined on a set $E \subset C_z$ and taking values in $C_w$ is called a (complex) function of a complex variable. Let $z_0 \in E$ be a nonisolated point of E (Sec. 3.75c), so that every neighborhood of $z_0$ contains a point of E other than $z_0$ itself. Then a complex number A is said to be the derivative of the function w = f(z) at the point $z = z_0$ relative to the set E, denoted by $f'_E(z_0)$, if, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that $0 < |z - z_0| < \delta$, $z \in E$ implies

$$\left| A - \frac{f(z) - f(z_0)}{z - z_0} \right| < \varepsilon.$$

In this case, we say that w = f(z) is differentiable at $z = z_0$, with derivative $A = f'_E(z_0)$, where A is clearly the limit of the “difference quotient”

$$\frac{f(z) - f(z_0)}{z - z_0} \tag{1}$$

in the direction $z \underset{E}{\to} z_0$ determined by the set of intersections of E with all “deleted neighborhoods” $0 < |z - z_0| < \delta$ (cf. Sec. 4.15a).


b. The above definition of the derivative “in the complex domain” closely resembles that of the derivative “in the real domain,” to which it reduces if E is an open interval on the real line containing the point $z_0 = x_0$ (see Sec. 7.11). Nevertheless, there are important differences between the implications of the two definitions, as will subsequently become apparent.


c. Let F be a subset of E, and let $z_0$ be a nonisolated point of F (and hence automatically a nonisolated point of E). Then the existence of $f'_E(z_0)$ obviously implies that of $f'_F(z_0)$ and

$$f'_F(z_0) = f'_E(z_0),$$

i.e., “differentiability relative to a larger set implies differentiability relative to a smaller set.” The converse is false, as shown, say, by Example 10.12b.


d. If the function f(z) is differentiable at the point $z = z_0$ relative to the set E, then, as $z \underset{E}{\to} z_0$, the quotient (1) is bounded by some constant C. Therefore

$$|f(z) - f(z_0)| \le C |z - z_0|$$

for all sufficiently small $|z - z_0|$ ($z \in E$), and hence

$$\lim_{z \underset{E}{\to} z_0} f(z) = f(z_0).$$

Thus f(z) is “continuous at $z = z_0$ relative to the set E” (cf. Theorem 7.12).


e. A set G in the plane $C_z$ is said to be open if every point $z \in G$ is an interior point of G, i.e., if whenever G contains $z_0$, G also contains some disk $|z - z_0| < \rho$ centered at $z_0$ (the radius of the disk depends in general on $z_0$). An open set $G \subset C_z$ is called a domain if it is (arcwise) connected, i.e., if every pair of points $z_0, z_1 \in G$ can be joined by a piecewise smooth curve entirely contained in G.†


f. A function f(z) is said to be analytic (synonymously, holomorphic or regular) on an open set $G \subset C_z$ if f(z) is differentiable on G.‡ We then denote the derivative of f(z) simply by $f'(z)$, without explicit reference to the underlying set G. Obviously, if f(z) is analytic on G and if $z_0 \in G$, then f(z) is differentiable at $z = z_0$ relative to every set E with $z_0$ as a nonisolated point and

$$f'_E(z_0) = f'(z_0).$$

A function f(z) is said to be analytic at a point $z = z_0$ (or on a set E) if f(z) is analytic on some open set G containing $z_0$ (or E).


10.12. Examples 


a. The function f(z) = z is analytic on the whole plane $C_z$, since

$$\frac{f(z) - f(z_0)}{z - z_0} = \frac{z - z_0}{z - z_0} = 1,$$

so that $f'(z_0) = 1$ for every $z_0$.


b. The function $f(z) = \bar{z} = x - iy$ is differentiable at every point $z = z_0$ relative to any ray E drawn from $z_0$, since

$$\frac{f(z) - f(z_0)}{z - z_0} = \frac{\overline{z - z_0}}{z - z_0} = e^{-2i \arg(z - z_0)},$$

and hence

$$f'_E(z_0) = e^{-2i \arg(z - z_0)} \qquad (z \in E,\ z \neq z_0). \tag{2}$$

However, (2) shows that $f(z) = \bar{z}$ fails to be differentiable relative to any set containing two distinct rays drawn from $z_0$ and hence relative to any domain containing $z_0$. Therefore $f(z) = \bar{z}$ fails to be analytic at every point $z = z_0 \in C_z$.
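The dependence of the difference quotient on the direction of approach can be seen numerically. The following Python sketch is only an illustration: the base point `z0`, the sample angles, and the helper names `difference_quotient` and `conjugate` are assumptions made here, not part of the text. It evaluates the quotient (1) for the conjugate function along several rays and compares it with the value predicted by formula (2).

```python
import cmath

def difference_quotient(f, z0, z):
    """Quotient (f(z) - f(z0)) / (z - z0) for z != z0."""
    return (f(z) - f(z0)) / (z - z0)

def conjugate(z):
    """The function f(z) = conj(z) = x - iy."""
    return z.conjugate()

z0 = 1 + 2j  # arbitrary base point (an assumption of this sketch)

for theta in (0.0, cmath.pi / 4, cmath.pi / 2):
    for r in (1e-1, 1e-3):
        z = z0 + cmath.rect(r, theta)       # point on the ray of polar angle theta
        q = difference_quotient(conjugate, z0, z)
        predicted = cmath.exp(-2j * theta)  # value given by formula (2)
        print(f"theta={theta:.4f}  r={r:g}  quotient={q:.6f}  predicted={predicted:.6f}")
```

Along each ray the quotient does not change with r, but it changes from ray to ray, which is why no single derivative exists relative to a set containing two distinct rays.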


10.13. Basic rules for calculating derivatives 


a. THEOREM. Suppose f(z) and g(z) are differentiable at the point $z = z_0$ relative to a set E, and let k be an arbitrary complex number. Then the functions f(z) + g(z), kf(z), f(z)g(z) and f(z)/g(z) are also differentiable at $z = z_0$ relative to E (provided $g(z_0) \neq 0$ in the last case), with derivatives

$$[k f(z)]'_E = k f'_E(z), \tag{3}$$

$$[f(z) + g(z)]'_E = f'_E(z) + g'_E(z), \tag{4}$$

$$[f(z) g(z)]'_E = f'_E(z) g(z) + f(z) g'_E(z), \tag{5}$$

$$\left[ \frac{f(z)}{g(z)} \right]'_E = \frac{f'_E(z) g(z) - f(z) g'_E(z)}{g^2(z)}, \tag{6}$$

all evaluated at $z = z_0$.


Proof. The complex analogue of the proof of Theorem 7.13. ∎


b. Example. Given any polynomial

$$P(z) = a_0 + a_1 z + \cdots + a_n z^n$$

and any rational function

$$R(z) = \frac{P(z)}{Q(z)} = \frac{a_0 + a_1 z + \cdots + a_n z^n}{b_0 + b_1 z + \cdots + b_m z^m},$$

it follows from the above theorem that P(z) is analytic everywhere, with derivative

$$P'(z) = a_1 + 2 a_2 z + \cdots + n a_n z^{n-1},$$

while R(z) is analytic at every point $z = z_0$ such that $Q(z_0) \neq 0$.


c. THEOREM (Differentiation of a composite function). Suppose w = f(z) is differentiable at $z = z_0$ relative to a set E, while $\zeta = g(w)$ is defined on a set F containing all the points w = f(z) with $z \in E$ sufficiently close to $z_0$ and is differentiable at $w = w_0 = f(z_0)$ relative to F. Then the composite function

$$\zeta = h(z) = g(f(z))$$

is differentiable at $z = z_0$ relative to E, with derivative

$$h'_E(z_0) = g'_F(w_0) f'_E(z_0). \tag{7}$$

Proof. The complex analogue of the proof of Theorem 7.15a. ∎

10.14. Higher derivatives. As in the real case, we define the higher derivatives

$$f_E^{(n+1)}(z) = [f_E^{(n)}(z)]'_E \qquad (n = 1, 2, \ldots)$$

of a complex function f(z) inductively, assuming that the derivatives in question exist.


THEOREM. If f(z) and g(z) have derivatives of order n relative to a set E, then

$$[k f(z)]_E^{(n)} = k f_E^{(n)}(z),$$

$$[f(z) + g(z)]_E^{(n)} = f_E^{(n)}(z) + g_E^{(n)}(z)$$

for arbitrary complex k.


Proof. The complex analogues of the proofs of Theorems 8.12a and 8.12b. ∎


10.15. Differentials 


a. Just as in the real case (Sec. 7.31), the increment of a complex function w = f(z) differentiable at a point $z = z_0$ relative to a set E can be written in the form

$$\Delta w = f(z_0 + h) - f(z_0) = f'_E(z_0) h + \varepsilon(h) h \qquad (h = z - z_0),$$

where $\varepsilon(h) \to 0$ as $h \to 0$. Suppose we denote the quantity $f'_E(z_0) h$, called the principal linear part of the increment $\Delta w$, by $dw(z_0)$, or more briefly by dw or df, while writing the increment h of the independent variable z as $d_E z$ or simply dz. Then the quantities dw, df, dz, etc. are called differentials, and the derivative $f'_E(z_0)$ can be written as a ratio of differentials:

$$f'_E(z_0) = \frac{dw}{dz} = \frac{df}{dz}$$

(cf. Sec. 7.32).


b. The rules for calculating derivatives summarized in Theorem 10.13 lead to corresponding rules for calculating differentials. In fact, multiplying both sides of formulas (3)-(6) by dz and dropping the subscript E, we get

$$d(kf) = k\, df, \tag{3'}$$

$$d(f + g) = df + dg, \tag{4'}$$

$$d(fg) = f\, dg + g\, df, \tag{5'}$$

$$d\left( \frac{f}{g} \right) = \frac{g\, df - f\, dg}{g^2} \qquad (g(z_0) \neq 0). \tag{6'}$$


c. Multiplying both sides of formula (7) by dz, we get

$$d\zeta = h'_E(z_0)\, dz = g'_F(w_0) f'_E(z_0)\, dz = g'_F(w_0)\, dw.$$

Thus the differential of a function does not depend on whether its argument is the independent variable or a function of some other independent variable (cf. Sec. 7.34).


10.16. The Cauchy-Riemann equations 


a. First we introduce the concept of a “partial derivative,” anticipated in the footnote on p. 357. Let F(x, y) be a real function of two real variables x and y, defined on some open set G in the xy-plane, and let $(x_0, y_0)$ be a point of G. Then the limit

$$\lim_{h \to 0} \frac{F(x_0 + h, y_0) - F(x_0, y_0)}{h}$$

(if it exists) is called the partial derivative of F(x, y) with respect to x at the point $(x_0, y_0)$, denoted by

$$\frac{\partial F(x_0, y_0)}{\partial x}$$

or $F'_x(x_0, y_0)$, while the limit

$$\lim_{h \to 0} \frac{F(x_0, y_0 + h) - F(x_0, y_0)}{h}$$

(if it exists) is called the partial derivative of F(x, y) with respect to y at the point $(x_0, y_0)$, denoted by

$$\frac{\partial F(x_0, y_0)}{\partial y}$$

or $F'_y(x_0, y_0)$.
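b. THEOREM. Suppose the function‡

$$w = f(z) = u(z) + iv(z) = u(x, y) + iv(x, y)$$

is differentiable at the point $z = z_0 = x_0 + iy_0$ relative to a set E containing a pair of line segments through $z_0$ parallel to the real and imaginary axes. Then the partial derivatives of u(x, y) and v(x, y) at the point $(x_0, y_0)$ satisfy the pair of equations

$$\frac{\partial u(x_0, y_0)}{\partial x} = \frac{\partial v(x_0, y_0)}{\partial y}, \qquad \frac{\partial v(x_0, y_0)}{\partial x} = -\frac{\partial u(x_0, y_0)}{\partial y}, \tag{8}$$

known as the Cauchy-Riemann equations.


Proof. First let $z = z_0 + h \in E$, where h is real. Then

$$\begin{aligned}
f'_E(z_0) &= \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = \lim_{h \to 0} \frac{u(z_0 + h) - u(z_0)}{h} + i \lim_{h \to 0} \frac{v(z_0 + h) - v(z_0)}{h} \\
&= \lim_{h \to 0} \frac{u(x_0 + h, y_0) - u(x_0, y_0)}{h} + i \lim_{h \to 0} \frac{v(x_0 + h, y_0) - v(x_0, y_0)}{h} \\
&= \frac{\partial u(x_0, y_0)}{\partial x} + i \frac{\partial v(x_0, y_0)}{\partial x},
\end{aligned} \tag{9}$$

since $f'_E(z_0)$ exists. Next let $z = z_0 + ih \in E$, where h is again real. Then

$$\begin{aligned}
f'_E(z_0) &= \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = \lim_{h \to 0} \frac{u(z_0 + ih) - u(z_0)}{ih} + i \lim_{h \to 0} \frac{v(z_0 + ih) - v(z_0)}{ih} \\
&= \lim_{h \to 0} \frac{u(x_0, y_0 + h) - u(x_0, y_0)}{ih} + i \lim_{h \to 0} \frac{v(x_0, y_0 + h) - v(x_0, y_0)}{ih} \\
&= -i \frac{\partial u(x_0, y_0)}{\partial y} + \frac{\partial v(x_0, y_0)}{\partial y}.
\end{aligned} \tag{10}$$

Equating real and imaginary parts of (9) and (10), we get (8). ∎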


The nub of the proof is that, in keeping with Sec. 10.11c, we must get the same value of $f'_E(z_0)$ whether z approaches $z_0$ along a line parallel to the real axis, as in (9), or along a line parallel to the imaginary axis, as in (10).
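The equations (8) lend themselves to a quick numerical check. The Python sketch below is a rough illustration under stated assumptions: the partial derivatives are estimated by central differences with a step `h` chosen here for convenience, and the test functions, the sample point, and the helper names `partials` and `cauchy_riemann_residuals` are hypothetical. It prints the two residuals $u_x - v_y$ and $v_x + u_y$, which should be close to zero where (8) holds.

```python
def partials(F, x, y, h=1e-6):
    """Central-difference estimates of dF/dx and dF/dy at (x, y)."""
    Fx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    Fy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return Fx, Fy

def cauchy_riemann_residuals(f, x, y):
    """Return (u_x - v_y, v_x + u_y) for f = u + iv at the point x + iy."""
    u = lambda s, t: f(complex(s, t)).real
    v = lambda s, t: f(complex(s, t)).imag
    ux, uy = partials(u, x, y)
    vx, vy = partials(v, x, y)
    return ux - vy, vx + uy

def cube(z):
    return z ** 3          # a polynomial, analytic everywhere (Example 10.13b)

def conj(z):
    return z.conjugate()   # the function of Example 10.12b

print("z^3:    ", cauchy_riemann_residuals(cube, 0.7, -0.4))
print("conj(z):", cauchy_riemann_residuals(conj, 0.7, -0.4))
```

For the polynomial both residuals are negligible, while for the conjugate function the first residual is close to 2, in agreement with Example 10.12b.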


10.17. In particular, if f(z) is analytic at a point $z = z_0$, then the Cauchy-Riemann equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}$$

hold at every point of some neighborhood of $z_0$ (cf. Sec. 10.11). The converse is also true, provided we impose an extra condition on the partial derivatives of u and v:


THEOREM. Suppose the Cauchy-Riemann equations hold in some neighborhood U of a point $z_0$, and suppose the partial derivatives $\partial u/\partial x$, $\partial u/\partial y$, $\partial v/\partial x$, $\partial v/\partial y$ are continuous on U. Then f(z) is analytic at the point $z = z_0$.


Proof. Let h = p + iq be such that $z_0 + h \in U$, and consider the increment

$$\begin{aligned}
f(z_0 + h) - f(z_0) &= [u(x_0 + p, y_0 + q) + iv(x_0 + p, y_0 + q)] - [u(x_0, y_0) + iv(x_0, y_0)] \\
&= [u(x_0 + p, y_0 + q) - u(x_0, y_0)] + i[v(x_0 + p, y_0 + q) - v(x_0, y_0)].
\end{aligned} \tag{11}$$

We can write the first term in the right-hand side of (11) as

$$u(x_0 + p, y_0 + q) - u(x_0, y_0) = [u(x_0 + p, y_0 + q) - u(x_0, y_0 + q)] + [u(x_0, y_0 + q) - u(x_0, y_0)], \tag{12}$$

and similarly for the second term. Applying Lagrange’s theorem (Sec. 7.44) to the first term in the right-hand side of (12) regarded as a function of x with $y = y_0 + q$ fixed, we get

$$u(x_0 + p, y_0 + q) - u(x_0, y_0 + q) = u_x(x_0 + \theta_1 p, y_0 + q)\, p \qquad (0 < \theta_1 < 1), \tag{13}$$

where $u_x$ denotes the partial derivative of u with respect to x. Similarly, applying Lagrange’s theorem to the second term in the right-hand side of (12) regarded as a function of y with $x = x_0$ fixed, we get

$$u(x_0, y_0 + q) - u(x_0, y_0) = u_y(x_0, y_0 + \theta_2 q)\, q \qquad (0 < \theta_2 < 1), \tag{14}$$

where $u_y$ denotes the partial derivative of u with respect to y. Substituting (13) and (14) into (12) and treating the second term in the right-hand side of (11) in the same way, we find that

$$\begin{aligned}
f(z_0 + h) - f(z_0) &= [u_x(x_0 + \theta_1 p, y_0 + q)\, p + u_y(x_0, y_0 + \theta_2 q)\, q] \\
&\quad + i[v_x(x_0 + \theta_3 p, y_0 + q)\, p + v_y(x_0, y_0 + \theta_4 q)\, q],
\end{aligned} \tag{15}$$

where the numbers $\theta_1, \theta_2, \theta_3, \theta_4$ all lie between 0 and 1.

We now use the Cauchy-Riemann equations to write (15) in the form

$$\begin{aligned}
f(z_0 + h) - f(z_0) &= [u_x(x_0 + \theta_1 p, y_0 + q)\, p + u_y(x_0, y_0 + \theta_2 q)\, q] \\
&\quad + i[u_x(x_0, y_0 + \theta_4 q)\, q - u_y(x_0 + \theta_3 p, y_0 + q)\, p].
\end{aligned}$$

By the assumed continuity of $u_x$ and $u_y$, this can be transformed further into

$$f(z_0 + h) - f(z_0) = u_x(x_0, y_0)(p + iq) - i u_y(x_0, y_0)(p + iq) + \varepsilon_1(p, q)\, p + \varepsilon_2(p, q)\, q,$$

where $\varepsilon_1(p, q)$ and $\varepsilon_2(p, q)$ both approach 0 as $p \to 0$, $q \to 0$ (i.e., as $\sqrt{p^2 + q^2} \to 0$). It follows that

$$\frac{f(z_0 + h) - f(z_0)}{h} = u_x(x_0, y_0) - i u_y(x_0, y_0) + \frac{\varepsilon_1(p, q)\, p + \varepsilon_2(p, q)\, q}{p + iq}.$$

But

$$\left| \frac{p}{p + iq} \right| \le 1, \qquad \left| \frac{q}{p + iq} \right| \le 1,$$

and hence

$$\left| \frac{\varepsilon_1(p, q)\, p + \varepsilon_2(p, q)\, q}{p + iq} \right| \le |\varepsilon_1(p, q)| + |\varepsilon_2(p, q)|,$$

where the expression on the right approaches 0 as $p \to 0$, $q \to 0$. Therefore

$$f'(z_0) = \lim_{h \to 0} \frac{f(z_0 + h) - f(z_0)}{h} = u_x(z_0) - i u_y(z_0) = u_x(z_0) + i v_x(z_0), \tag{16}$$

i.e., f(z) is differentiable at $z = z_0$, with derivative (16). The same argument shows that f(z) has a derivative at every point sufficiently near $z = z_0$. Hence f(z) is analytic at $z = z_0$. ∎


By introducing the concept of a differentiable function of two variables, we can establish necessary and sufficient conditions (involving the Cauchy-Riemann equations) for a complex function to be differentiable at a point (see Problem 6).


10.18. Harmonic functions. It will be shown below (see Theorem 10.34) that every analytic function f(z) = u + iv has a second derivative (and in fact derivatives of all orders). Arguing as in the proof of the Cauchy-Riemann equations themselves (Theorem 10.16b), we find that

$$f''(z) = \frac{\partial}{\partial x} \left( \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} \right) = \frac{\partial^2 u}{\partial x^2} + i \frac{\partial^2 v}{\partial x^2},$$

while, on the other hand,

$$f''(z) = \frac{1}{i} \frac{\partial}{\partial y} \left( \frac{1}{i} \frac{\partial u}{\partial y} + \frac{\partial v}{\partial y} \right) = -\frac{\partial^2 u}{\partial y^2} - i \frac{\partial^2 v}{\partial y^2},$$

where $\partial^2 u/\partial x^2$ denotes the second partial derivative of the function u(x, y) with respect to x (i.e., the second derivative of u(x, y) regarded as a function of x with y held fixed), $\partial^2 u/\partial y^2$ denotes the second partial derivative of u(x, y) with respect to y, and similarly for $\partial^2 v/\partial x^2$ and $\partial^2 v/\partial y^2$. Equating real and imaginary parts of these two expressions for $f''(z)$, we get

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \qquad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0.$$

In general, any continuous real function $\varphi = \varphi(x, y)$ satisfying the equation

$$\frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial^2 \varphi}{\partial y^2} = 0$$

(known as Laplace’s equation) at every point of a domain G is said to be harmonic on G. Thus we see that the real and imaginary parts of a function analytic on a domain G are harmonic on G.

Conversely, it can be shown† that every function u(x, y) harmonic on a domain G is the real part of a function f(z) analytic on G, or the imaginary part of the analytic function if(z). Two harmonic functions u(x, y) and v(x, y) are called conjugate harmonic functions if they are the real and imaginary parts of a single analytic function f(z), i.e., if they are related by the Cauchy-Riemann equations.
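As a small numerical illustration of Laplace's equation, the Python sketch below estimates $\partial^2 F/\partial x^2 + \partial^2 F/\partial y^2$ with a five-point finite-difference stencil. The step `h`, the sample point, the choice of $f(z) = z^3$, and the names `laplacian`, `u`, `v`, `not_harmonic` are all assumptions made for this example. The real and imaginary parts of $z^3$ should give values near zero, while $x^2 + y^2$ should give 4.

```python
def laplacian(F, x, y, h=1e-3):
    """Five-point finite-difference estimate of F_xx + F_yy at (x, y)."""
    return (F(x + h, y) + F(x - h, y) + F(x, y + h) + F(x, y - h)
            - 4 * F(x, y)) / h ** 2

def u(x, y):
    """Real part of z^3, i.e. x^3 - 3xy^2."""
    return (complex(x, y) ** 3).real

def v(x, y):
    """Imaginary part of z^3, i.e. 3x^2 y - y^3."""
    return (complex(x, y) ** 3).imag

def not_harmonic(x, y):
    """x^2 + y^2, whose Laplacian is identically 4."""
    return x * x + y * y

x0, y0 = 0.7, -0.4
print("Laplacian of Re z^3:   ", laplacian(u, x0, y0))
print("Laplacian of Im z^3:   ", laplacian(v, x0, y0))
print("Laplacian of x^2 + y^2:", laplacian(not_harmonic, x0, y0))
```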


10.2. Line Integrals of Complex Functions 


10.21. Line integrals of the form

$$\int_L f(P)\, dg(P), \tag{1}$$

where L is a piecewise smooth path with no singular points, have already been encountered in Sec. 9.121 for the case of real f(P) and g(P).† We now generalize the definition of (1) to include the case of complex functions

$$f(P) = f_1(P) + i f_2(P), \qquad g(P) = g_1(P) + i g_2(P),$$

by the simple expedient of setting

$$\int_L f(P)\, dg(P) = \int_L f_1(P)\, dg_1(P) - f_2(P)\, dg_2(P) + i \int_L f_1(P)\, dg_2(P) + f_2(P)\, dg_1(P), \tag{2}$$

which is in keeping, formally at least, with formulas (6) and (7), pp. 363-364. Note that (2) reduces to

$$\int_L f(P)\, dg(P) = \int_L f_1(P)\, dg(P) + i \int_L f_2(P)\, dg(P)$$

if f(P) is complex and g(P) real, while (2) reduces to

$$\int_L f(P)\, dg(P) = \int_L f(P)\, dg_1(P) + i \int_L f(P)\, dg_2(P)$$

if f(P) is real and g(P) complex.
Now let L be a plane curve, and let

$$f(z) = f(x, y) = u(x, y) + iv(x, y), \qquad g(z) = z = x + iy.$$

Then (2) becomes

$$\int_L f(z)\, dz = \int_L u\, dx - v\, dy + i \int_L u\, dy + v\, dx \tag{3}$$

(with P = z). We can also give a direct definition of the integral (3). To this end, let $\Pi$ be a partition of L with points of subdivision $z_k = x_k + iy_k$. Then

$$\int_L f(z)\, dz = \lim_{d(\Pi) \to 0} \sum_{k=0}^{n-1} f(z_k)\, \Delta z_k, \tag{4}$$

where $\Delta z_k = z_{k+1} - z_k$ and

$$d(\Pi) = \max \{ |\Delta z_0|, |\Delta z_1|, \ldots, |\Delta z_{n-1}| \}.$$

In fact, the right-hand side of (4) is just the limit of

$$\sum_{k=0}^{n-1} u(x_k, y_k)\, \Delta x_k - \sum_{k=0}^{n-1} v(x_k, y_k)\, \Delta y_k + i \sum_{k=0}^{n-1} u(x_k, y_k)\, \Delta y_k + i \sum_{k=0}^{n-1} v(x_k, y_k)\, \Delta x_k,$$

which clearly approaches the right-hand side of (3) as $d(\Pi) \to 0$ (cf. Theorem 9.73 and Sec. 9.121).
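The sum in (4) can be evaluated directly on a computer. The following Python sketch is a minimal illustration, assuming a quarter of the unit circle as the path, the integrand $f(z) = z^2$, and a uniform subdivision of the parameter interval; the helper names `line_integral` and `path_length` are hypothetical. The sum is compared with $(z_1^3 - z_0^3)/3$, the value obtained by treating $z^3/3$ as an antiderivative of $z^2$, and with the crude bound given by the length of the path times the largest value of $|f|$ on it.

```python
import cmath

def line_integral(f, z, a, b, n=2000):
    """Approximate the integral of f along z(t), a <= t <= b, by the sum in (4)."""
    pts = [z(a + (b - a) * k / n) for k in range(n + 1)]
    return sum(f(pts[k]) * (pts[k + 1] - pts[k]) for k in range(n))

def path_length(z, a, b, n=2000):
    """Length of the polygonal line inscribed in the path z(t)."""
    pts = [z(a + (b - a) * k / n) for k in range(n + 1)]
    return sum(abs(pts[k + 1] - pts[k]) for k in range(n))

def quarter_circle(t):
    return cmath.exp(1j * t)          # from z0 = 1 (t = 0) to z1 = i (t = pi/2)

def square(z):
    return z * z

I = line_integral(square, quarter_circle, 0.0, cmath.pi / 2)
exact = ((1j) ** 3 - 1 ** 3) / 3      # (z1^3 - z0^3) / 3
length = path_length(quarter_circle, 0.0, cmath.pi / 2)

print("Riemann sum:       ", I)
print("(z1^3 - z0^3) / 3: ", exact)
print("|I| and l * sup|f|:", abs(I), length * 1.0)   # |z^2| = 1 on the unit circle
```

The sum converges only at the rate 1/n, so a few thousand subdivision points give roughly three correct digits.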
The following properties of “complex line integrals” of the form

$$\int_L f(z)\, dz$$

are all immediate consequences of the definition (4):

(a) If $\alpha_1$, $\alpha_2$ are arbitrary complex numbers and $f_1(z)$, $f_2(z)$ arbitrary integrable functions, then

$$\int_L [\alpha_1 f_1(z) + \alpha_2 f_2(z)]\, dz = \alpha_1 \int_L f_1(z)\, dz + \alpha_2 \int_L f_2(z)\, dz.$$

(b) If L is made up of two arcs $L_1$ and $L_2$ “joined end to end,” then

$$\int_L f(z)\, dz = \int_{L_1} f(z)\, dz + \int_{L_2} f(z)\, dz.$$

(c) If the path L is of length l, then

$$\left| \int_L f(z)\, dz \right| \le \int_L |f(z)|\, ds \le l \sup_{z \in L} |f(z)|. \tag{5}$$


10.22. a. THEOREM. Let $f_n(z)$ be a sequence of functions, each continuous on a piecewise smooth path L, and suppose $f_n(z)$ converges uniformly on L to a limit function f(z). Then

$$\lim_{n \to \infty} \int_L f_n(z)\, dz = \int_L f(z)\, dz. \tag{6}$$

Proof. The integrability of f(z) follows from its continuity (cf. Theorem 9.55b), which in turn follows from the fact that f(z) is the “uniform limit” of a sequence of continuous functions (see Corollary 5.95b). To prove (6), we deduce from (5) that

$$\left| \int_L [f(z) - f_n(z)]\, dz \right| \le l \sup_{z \in L} |f(z) - f_n(z)|, \tag{7}$$

where l is the length of L. But the right-hand side approaches 0 as $n \to \infty$, by the assumed uniform convergence of $f_n(z)$. ∎


b. THEOREM. Let $f_n(z, \lambda)$ be a sequence of functions, each continuous on a piecewise smooth path L and dependent on a parameter $\lambda$ varying over some set $\Lambda$, and suppose $f_n(z, \lambda)$ converges uniformly on the direct product $L \times \Lambda$ to a limit function $f(z, \lambda)$. Then the sequence of functions

$$F_n(\lambda) = \int_L f_n(z, \lambda)\, dz$$

converges uniformly on $\Lambda$ to the function

$$F(\lambda) = \int_L f(z, \lambda)\, dz.$$


Proof. This time we have

$$\left| \int_L [f(z, \lambda) - f_n(z, \lambda)]\, dz \right| \le l \sup_{z \in L,\ \lambda \in \Lambda} |f(z, \lambda) - f_n(z, \lambda)| \tag{7'}$$

instead of (7). But the right-hand side approaches 0 as $n \to \infty$, by the assumed uniform convergence of $f_n(z, \lambda)$. ∎


10.23. Next we prove a result closely related to the mean value theorem for ordinary integrals (Theorem 9.15f):


THEOREM. If f(z) is continuous on a piecewise smooth path

$$L = \{ z : z = z(t),\ a \le t \le b \}$$

and if $z'(t_0) \neq 0$, where $z_0 = z(t_0)$, then

$$\lim_{z \to z_0} \frac{1}{z - z_0} \int_{L(z_0, z)} f(\zeta)\, d\zeta = f(z_0), \tag{8}$$

where $L(z_0, z)$ is the arc of L joining $z_0$ to z.


Proof. If $f(z) \equiv f(z_0) = \text{const}$, then (8) is an immediate consequence of (4). Hence in the general case we can assume that $f(z_0) = 0$, since otherwise we need only replace f(z) by $f(z) - f(z_0)$. Given any $\varepsilon > 0$, choose $\delta > 0$ such that $|z - z_0| < \delta$, $z \in L$ implies $|f(z)| < \varepsilon$. Then, by (5),

$$\left| \int_{L(z_0, z)} f(\zeta)\, d\zeta \right| \le \varepsilon\, s(z_0, z),$$

where $s(z_0, z)$ is the length of the arc $L(z_0, z)$, and hence

$$\left| \frac{1}{z - z_0} \int_{L(z_0, z)} f(\zeta)\, d\zeta \right| \le \varepsilon\, \frac{s(z_0, z)}{|z - z_0|}. \tag{9}$$

But

$$\lim_{z \to z_0} \frac{s(z_0, z)}{|z - z_0|} = 1$$

by Theorem 9.73, and hence for sufficiently small $|z - z_0|$ we can replace (9) by the estimate

$$\left| \frac{1}{z - z_0} \int_{L(z_0, z)} f(\zeta)\, d\zeta \right| \le 2\varepsilon,$$

which immediately implies (8). ∎


10.24. THEOREM. Let $f(z, \zeta)$ be a function of two complex variables z and $\zeta$, where z = x + iy varies over a piecewise smooth path L in the z-plane and $\zeta = \xi + i\eta$ varies over a piecewise smooth path $\Lambda$ in the $\zeta$-plane. Suppose $f(z, \zeta)$ is continuous in both arguments z and $\zeta$.† Then each of the integrals

$$I(\zeta) = \int_L f(z, \zeta)\, dz, \qquad J(z) = \int_\Lambda f(z, \zeta)\, d\zeta$$

is continuous, the first on $\Lambda$ and the second on L, and

$$\int_\Lambda \left\{ \int_L f(z, \zeta)\, dz \right\} d\zeta = \int_L \left\{ \int_\Lambda f(z, \zeta)\, d\zeta \right\} dz.$$


Proof. An immediate consequence of Theorems 9.111 and 9.112, since complex line integrals can be expressed as “linear combinations” of ordinary real integrals. ∎

10.25. a. THEOREM. If f(z) is differentiable on a set E containing a piecewise smooth path L, with initial point $z_0$ and final point $z_1$, then

$$\int_L f'_E(z)\, dz = f(z_1) - f(z_0).$$

Proof. Let

$$z = z(t) \qquad (t_0 \le t \le t_1)$$

be the parametric representation of L, where z(t) is piecewise smooth. Then

$$\int_L f'_E(z)\, dz = \int_{t_0}^{t_1} f'_E(z(t))\, z'(t)\, dt = \int_{t_0}^{t_1} [f(z(t))]'\, dt = f(z(t_1)) - f(z(t_0)) = f(z_1) - f(z_0)$$

(justify the various steps). ∎


b. THEOREM. If f(z) is analytic on a domain G, then

$$\int_L f'(z)\, dz = f(z_1) - f(z_0) \tag{10}$$

for every piecewise smooth path $L \subset G$ joining $z_0$ to $z_1$.

Proof. An immediate consequence of the preceding theorem. ∎


c. THEOREM. If f(z) is analytic on a domain G, then

$$\oint_L f'(z)\, dz = 0$$

for every piecewise smooth closed path $L \subset G$.

Proof. Let $z_1 = z_0$ in (10). ∎


d. Example. Given any polynomial

$$P(z) = a_0 + a_1 z + \cdots + a_n z^n$$

and any closed path $L \subset C_z$, we have

$$\oint_L P(z)\, dz = 0,$$

since P(z) is the derivative of the polynomial

$$P_1(z) = a_0 z + a_1 \frac{z^2}{2} + \cdots + a_n \frac{z^{n+1}}{n + 1},$$

which is clearly analytic on the whole plane $C_z$.
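Formula (10) and the example just given can be checked numerically by writing the line integral in the parametric form $\int f(z(t))\, z'(t)\, dt$ used in the proof of Theorem 10.25a. The Python sketch below is an illustration only: the polynomial, the elliptical path, and the use of the composite Simpson rule (helper names `simpson` and `path_integral`) are assumptions made here, not the book's method.

```python
import cmath

def simpson(g, a, b, n=1000):
    """Composite Simpson rule for a complex-valued g on [a, b] (n must be even)."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

def path_integral(f, z, dz, a, b, n=1000):
    """Integral of f along the path z(t), a <= t <= b, where dz(t) = z'(t)."""
    return simpson(lambda t: f(z(t)) * dz(t), a, b, n)

def P(z):
    return 1 + 2 * z - 3 * z ** 2       # sample polynomial

def P1(z):
    return z + z ** 2 - z ** 3          # polynomial whose derivative is P

def z(t):
    return 0.5 + 0.5j + 2 * cmath.cos(t) + 1j * cmath.sin(t)   # an ellipse

def dz(t):
    return -2 * cmath.sin(t) + 1j * cmath.cos(t)

closed = path_integral(P, z, dz, 0.0, 2 * cmath.pi)        # whole ellipse
open_value = path_integral(P, z, dz, 0.0, cmath.pi / 2)    # a quarter of it

print("closed path:        ", closed)
print("open arc:           ", open_value)
print("P1(z_1) - P1(z_0):  ", P1(z(cmath.pi / 2)) - P1(z(0.0)))
```

The integral over the whole ellipse is zero up to rounding error, and the integral over the arc agrees with the difference of the values of $P_1$ at its end points.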

10.26. Next we prove the converse of Theorem 10.25c: 

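THEOREM. Let $\varphi(z)$ be continuous on a domain G, and suppose the integral

$$\oint_L \varphi(z)\, dz$$

of $\varphi(z)$ along every piecewise smooth closed path $L \subset G$ vanishes. Then there exists a function f(z) analytic on G such that $f'(z) = \varphi(z)$.


Proof. Let $L_z \subset G$ be any piecewise smooth path joining a fixed point $z_0 \in G$ to a variable point $z \in G$,† and let

$$f(z) = \int_{L_z} \varphi(\zeta)\, d\zeta. \tag{11}$$

Then f(z) is independent of the choice of $L_z$. In fact, if $z_0 p z$ and $z_0 q z$ are two paths joining $z_0$ to z, as shown in Figure 64, then

$$\int_{z_0 p z} \varphi(\zeta)\, d\zeta + \int_{z q z_0} \varphi(\zeta)\, d\zeta = \oint_{z_0 p z q z_0} \varphi(\zeta)\, d\zeta = 0,$$

Figure 64

by hypothesis, since the path $z_0 p z q z_0$ is closed, and hence

$$\int_{z_0 p z} \varphi(\zeta)\, d\zeta = -\int_{z q z_0} \varphi(\zeta)\, d\zeta = \int_{z_0 q z} \varphi(\zeta)\, d\zeta.$$

Thus it makes sense to write just

$$f(z) = \int_{z_0}^{z} \varphi(\zeta)\, d\zeta \tag{11'}$$

instead of (11).

To prove that f(z) is analytic on G, with derivative $f'(z) = \varphi(z)$, we note that

$$\begin{aligned}
f(z + h) - f(z) &= \int_{z_0}^{z+h} \varphi(\zeta)\, d\zeta - \int_{z_0}^{z} \varphi(\zeta)\, d\zeta = \int_{z}^{z+h} \varphi(\zeta)\, d\zeta \\
&= \int_{z}^{z+h} \varphi(z)\, d\zeta + \int_{z}^{z+h} [\varphi(\zeta) - \varphi(z)]\, d\zeta \\
&= h \varphi(z) + \int_{z}^{z+h} [\varphi(\zeta) - \varphi(z)]\, d\zeta,
\end{aligned}$$

where the last integral can be evaluated along any piecewise smooth path in G joining z to z + h. If |h| is small enough, this path can be chosen to be the line segment joining z to z + h. But then, by (5),

$$\left| \frac{f(z + h) - f(z)}{h} - \varphi(z) \right| = \frac{1}{|h|} \left| \int_{z}^{z+h} [\varphi(\zeta) - \varphi(z)]\, d\zeta \right| \le \frac{1}{|h|}\, |h| \sup_{|\zeta - z| \le |h|} |\varphi(\zeta) - \varphi(z)|,$$

where the expression on the right approaches 0 as $h \to 0$, by the continuity of $\varphi(z)$. It follows that

$$f'(z) = \lim_{h \to 0} \frac{f(z + h) - f(z)}{h} = \varphi(z). \qquad ∎$$
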

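10.27. THEOREM. Let $f(z, \zeta)$ be a function of two complex variables z and $\zeta$, where z = x + iy varies over a domain G in the z-plane and $\zeta = \xi + i\eta$ varies over a piecewise smooth path $\Lambda$ in the $\zeta$-plane. Suppose $f(z, \zeta)$ is analytic on G for every fixed $\zeta \in \Lambda$, while both $f(z, \zeta)$ and its partial derivative $f'_z(z, \zeta)$ with respect to z are continuous on $G \times \Lambda$. Then the function

$$\Phi(z) = \int_\Lambda f(z, \zeta)\, d\zeta$$

is analytic on G, with derivative

$$\varphi(z) = \int_\Lambda f'_z(z, \zeta)\, d\zeta.$$


Proof. It is easy to see that

$$\oint_L \varphi(z)\, dz = 0$$

for every piecewise smooth closed path $L \subset G$. In fact, by Theorems 10.24 and 10.25c,

$$\oint_L \varphi(z)\, dz = \oint_L \left\{ \int_\Lambda f'_z(z, \zeta)\, d\zeta \right\} dz = \int_\Lambda \left\{ \oint_L f'_z(z, \zeta)\, dz \right\} d\zeta = 0.$$

Hence, by Theorem 10.26, there exists a function $\Phi_0(z)$ analytic on G, with derivative $\Phi_0'(z) = \varphi(z)$, where

$$\begin{aligned}
\Phi_0(z) &= \int_{z_0}^{z} \varphi(\tau)\, d\tau = \int_{z_0}^{z} \left\{ \int_\Lambda f'_\tau(\tau, \zeta)\, d\zeta \right\} d\tau = \int_\Lambda \left\{ \int_{z_0}^{z} f'_\tau(\tau, \zeta)\, d\tau \right\} d\zeta \\
&= \int_\Lambda f(z, \zeta)\, d\zeta - \int_\Lambda f(z_0, \zeta)\, d\zeta,
\end{aligned}$$

again by Theorem 10.24. Therefore $\Phi_0(z)$ differs from $\Phi(z)$ by only a constant, so that $\Phi(z)$ is itself analytic on G. ∎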


10.28. By an integral of the Cauchy type we mean an expression of the form

$$F(z) = \int_\Lambda \frac{f(\zeta)}{\zeta - z}\, d\zeta \qquad (z \notin \Lambda), \tag{12}$$

where $f(\zeta)$ is analytic on a piecewise smooth path $\Lambda \subset C_z$. Clearly F(z) is defined at every point of the open set $G = C_z - \Lambda$ (G may not be connected and hence may not be a domain).


THEOREM. The function F(z) defined by (12) is analytic on G, with derivatives of all orders on G (themselves analytic), given by

$$F^{(n)}(z) = n! \int_\Lambda \frac{f(\zeta)}{(\zeta - z)^{n+1}}\, d\zeta \qquad (n = 1, 2, \ldots).$$


Proof. For every fixed $\zeta \in \Lambda$, the function

$$f(z, \zeta) = \frac{f(\zeta)}{\zeta - z}$$

and its partial derivatives with respect to z

$$f_z^{(n)}(z, \zeta) = n!\, \frac{f(\zeta)}{(\zeta - z)^{n+1}} \qquad (n = 1, 2, \ldots)$$

are analytic on G. Moreover, the function $f(z, \zeta)$ and all its derivatives $f_z^{(n)}(z, \zeta)$ are continuous on $G \times \Lambda$. Now apply Theorem 10.27. ∎
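The derivative formula for an integral of the Cauchy type can be tested numerically. In the Python sketch below, the path is taken to be the unit circle, the density is $f(\zeta) = e^\zeta$, and the integral is approximated by an equally spaced sum over the circle; these choices, the step of the finite difference, and the helper name `cauchy_type` are all assumptions of this example. The value of the formula for n = 1 is compared with a central-difference estimate of $F'(z)$ computed from F itself.

```python
import cmath
import math

def cauchy_type(f, z, order=0, n=400):
    """order! times the integral of f(zeta) / (zeta - z)^(order + 1) over |zeta| = 1.

    For order = 0 this is F(z) in (12); for order >= 1 it is the
    expression asserted to equal the derivative of that order.
    """
    total = 0j
    for k in range(n):
        t = 2 * math.pi * k / n
        zeta = cmath.exp(1j * t)                 # point of the unit circle
        dzeta = 1j * zeta * (2 * math.pi / n)    # zeta'(t) dt
        total += f(zeta) / (zeta - z) ** (order + 1) * dzeta
    return math.factorial(order) * total

z = 0.3 + 0.2j     # a point off the path
h = 1e-5           # finite-difference step

by_formula = cauchy_type(cmath.exp, z, order=1)
by_difference = (cauchy_type(cmath.exp, z + h) - cauchy_type(cmath.exp, z - h)) / (2 * h)

print("formula for F'(z):  ", by_formula)
print("central difference: ", by_difference)
```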


10.3. Cauchy’s Theorem and Its Consequences 


We now present the central results of the theory of analytic functions, due 
mainly to Cauchy. 


10.31. Let L be a curve with parametric equations

$$x = x(t), \quad y = y(t) \qquad (a \le t \le b),$$

where x(t) and y(t) are continuous. If the same point $(x, y) \in L$ corresponds to more than one parameter value in the interval $a \le t < b$, we say that (x, y) is a multiple point of L. A curve with no multiple points is said to be simple.† It is to allow for the possibility of simple closed curves, i.e., simple curves whose end points coincide, that we consider parameter values in the half-open interval $a \le t < b$ rather than in the closed interval $a \le t \le b$. Every simple closed curve L partitions the plane into a bounded domain (called the interior of L) and an unbounded domain (called the exterior of L). This entirely plausible result, known as the Jordan curve theorem, is proved in elementary topology.

A domain G is said to be simply connected if whenever G contains a simple closed curve L, G also contains the interior of L. Thus an open disk or, more generally, any convex domain is simply connected,‡ but not the annulus (i.e., ring-shaped domain) $r_1 < |z| < r_2$ shown in Figure 65.


Figure 65


10.32. THEOREM (Cauchy’s theorem). If f(z) is analytic on a simply connected domain G, then

$$\oint_L f(z)\, dz = 0$$

for every piecewise smooth closed path $L \subset G$.


Proof. According to Theorem 9.125b, the integral along L is a limit of integrals along polygonal lines $L_n$ inscribed in L. Hence we need only prove the theorem for closed polygonal lines,‡ in fact for a closed polygonal line with no “self-intersections” (why?).
Thus let L be a simple closed polygonal line. Since G is simply connected, we can partition L and its interior into a finite number of triangles $T_1, \ldots, T_m$, each contained in G. Then

$$\oint_L f(z)\, dz = \oint_{T_1} f(z)\, dz + \cdots + \oint_{T_m} f(z)\, dz$$

(where for simplicity we use the symbols $T_1, \ldots, T_m$ both for the triangles and their boundaries), since the integrals over the sides of the triangles inside L cancel each other out, each such side being traversed twice in opposite directions (see Figure 66). Hence we need only prove Cauchy’s theorem for a triangular contour.

Figure 66

To this end, let T be a triangle contained in G, and let

$$I = \oint_T f(z)\, dz$$

(as before, T denotes both the triangle and its boundary). Suppose we partition the triangle T into four equal triangles $T_1, T_2, T_3, T_4$ by joining the midpoints of consecutive sides of T by line segments, as shown in Figure 67. Then

$$\oint_T f(z)\, dz = \oint_{T_1} f(z)\, dz + \oint_{T_2} f(z)\, dz + \oint_{T_3} f(z)\, dz + \oint_{T_4} f(z)\, dz,$$

Figure 67

and hence at least one of the smaller triangles, say $T^{(1)}$, must satisfy the inequality

$$\left| \oint_{T^{(1)}} f(z)\, dz \right| \ge \frac{|I|}{4}.$$

Dividing the triangle $T^{(1)}$ in turn into four equal triangles $T_1^{(1)}, \ldots, T_4^{(1)}$, we then find a new triangle, say $T^{(2)}$, such that

$$\left| \oint_{T^{(2)}} f(z)\, dz \right| \ge \frac{|I|}{4^2}.$$

Continuing this process indefinitely, we get a system of “nested triangles” $T^{(1)} \supset T^{(2)} \supset \cdots \supset T^{(n)} \supset \cdots$, each containing the next, such that

$$\left| \oint_{T^{(n)}} f(z)\, dz \right| \ge \frac{|I|}{4^n}. \tag{1}$$

Each triangle $T^{(n)}$ is a compact set in the complex plane, and hence there is a point $z_0$ contained in all the triangles $T^{(n)}$ (see Sec. 5.14e and Theorem 3.98). The function f(z), being differentiable on G, is certainly differentiable at $z = z_0$. Therefore, given any $\varepsilon > 0$, there is a $\delta > 0$ such that $|z - z_0| < \delta$ implies

$$f(z) = f(z_0) + f'(z_0)(z - z_0) + \alpha(z, z_0), \tag{2}$$

where

$$|\alpha(z, z_0)| < \varepsilon |z - z_0|$$

(cf. Sec. 10.15a) and all the terms in (2) are obviously continuous (and hence integrable). Let n be such that $T^{(n)}$ is contained in the disk $|z - z_0| < \delta$ (this is the case for all sufficiently large n). Then

$$\oint_{T^{(n)}} f(z)\, dz = \oint_{T^{(n)}} f(z_0)\, dz + \oint_{T^{(n)}} f'(z_0)(z - z_0)\, dz + \oint_{T^{(n)}} \alpha(z, z_0)\, dz,$$

where the first two terms on the right vanish, by Example 10.25d. Since the perimeter of $T^{(n)}$ equals $l/2^n$, where l is the perimeter of the original triangle T, it follows that

$$\left| \oint_{T^{(n)}} f(z)\, dz \right| = \left| \oint_{T^{(n)}} \alpha(z, z_0)\, dz \right| \le \varepsilon\, \frac{l}{2^n} \cdot \frac{l}{2^n} = \varepsilon\, \frac{l^2}{4^n},$$

where we use the fact that the distance between any two points of a triangle must be less than the perimeter of the triangle. Comparing this inequality with (1), we find that

$$\frac{|I|}{4^n} \le \varepsilon\, \frac{l^2}{4^n}$$

or

$$|I| \le \varepsilon l^2.$$

But then I = 0, since $\varepsilon$ can be made arbitrarily small. ∎
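The two facts on which the proof rests, namely that the integral over a triangle equals the sum of the integrals over the four triangles obtained by joining the midpoints of its sides, and that the integral of an analytic function over a triangle vanishes, can be illustrated numerically. The Python sketch below is an assumption-laden example: the triangle, the two test functions, and the use of Simpson's rule on each side (helper names `segment_integral`, `triangle_integral`, `subdivide`) are chosen here for convenience. For the non-analytic function $\bar{z}$ the integral is not zero, but it is still reproduced by the four sub-triangles.

```python
import cmath

def segment_integral(f, z0, z1, n=400):
    """Composite Simpson rule for the integral of f along the segment from z0 to z1."""
    h = 1.0 / n
    dz = z1 - z0
    s = f(z0) + f(z1)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(z0 + k * h * dz)
    return s * h / 3 * dz

def triangle_integral(f, a, b, c):
    """Integral of f over the boundary of the triangle, traversed a -> b -> c -> a."""
    return (segment_integral(f, a, b) + segment_integral(f, b, c)
            + segment_integral(f, c, a))

def subdivide(a, b, c):
    """Four equal triangles formed by the midpoints of the sides, same orientation."""
    ab, bc, ca = (a + b) / 2, (b + c) / 2, (c + a) / 2
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

def conj(z):
    return z.conjugate()

a, b, c = 0j, 2 + 0j, 1 + 1.5j      # vertices in counterclockwise order

whole = triangle_integral(conj, a, b, c)
parts = sum(triangle_integral(conj, *t) for t in subdivide(a, b, c))
analytic = triangle_integral(cmath.exp, a, b, c)

print("conj(z), whole triangle:     ", whole)
print("conj(z), four sub-triangles: ", parts)
print("exp(z),  whole triangle:     ", analytic)
```

The interior sides are traversed twice in opposite directions, so the four sub-triangle integrals add up to the integral over the whole boundary for any continuous integrand.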
10.33. Cauchy’s formula 


a. The deleted neighborhood (or “punctured disk”) $0 < |z - a| < R$ is an example
of a multiply connected domain, i.e., of a domain which is not simply connected.
The function

$$\frac{1}{(z - a)^n} \qquad (n = 1, 2, \ldots)$$

is analytic on this domain, but not on the whole disk $|z - a| < R$. To evaluate the
integral

$$\oint_L \frac{dz}{(z - a)^n},$$

where $L$ is a circle of radius $r < R$ centered at the point $a$ (and traversed in the
counterclockwise direction), we choose the polar angle $\theta$ as parameter, so that

$$z - a = re^{i\theta}, \qquad (z - a)^n = r^n e^{in\theta}, \qquad dz = ire^{i\theta}\,d\theta.$$

We then have

$$\oint_L \frac{dz}{(z - a)^n} = ir^{1-n}\int_0^{2\pi} e^{i(1-n)\theta}\,d\theta =
\begin{cases}
ir^{1-n}\left.\dfrac{e^{i(1-n)\theta}}{i(1-n)}\right|_0^{2\pi} = 0 & \text{if } n \ne 1,\\[2ex]
\left. i\theta\,\right|_0^{2\pi} = 2\pi i & \text{if } n = 1.
\end{cases}$$

The fact that this integral equals $2\pi i \ne 0$ if $n = 1$ shows that the requirement that
the domain $G$ be simply connected cannot be dropped in Cauchy’s theorem.
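
The computation above can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `circle_integral`, the sample count, and the particular center and radius are illustrative assumptions. It approximates the contour integral by the trapezoidal rule in the polar angle.

```python
import cmath
import math

def circle_integral(f, center, radius, samples=2000):
    """Trapezoidal approximation of the counterclockwise integral of f
    over the circle |z - center| = radius, with z = center + r e^{i theta}."""
    total = 0j
    dtheta = 2 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        z = center + radius * e
        total += f(z) * (1j * radius * e) * dtheta  # dz = i r e^{i theta} d theta
    return total

a, r = 1 + 2j, 0.5  # hypothetical center and radius, chosen only for illustration
values = {n: circle_integral(lambda z, n=n: 1 / (z - a) ** n, a, r)
          for n in range(1, 5)}
for n, v in values.items():
    print(n, v)  # close to 2*pi*i for n = 1, close to 0 otherwise
```

Only the case $n = 1$ should give a value near $2\pi i$; the other values should vanish up to rounding error.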


b. Next we establish an important general property of integrals along closed
paths in the same domain. Let $f(z)$ be analytic on a domain $G$ (which is in
general multiply connected), and let $L_1$, $L_2$ be two piecewise smooth closed
paths in $G$ such that $L_1$ surrounds $L_2$.† Suppose $L_1$ and $L_2$, together with a
piecewise smooth arc $\Gamma$ joining them, form the boundary of a simply connected
domain $G^* \subset G$, and let $L$ be the boundary of $G^*$, made up of the paths $L_1$, $L_2$ and
the arc $\Gamma$ traversed twice in opposite directions (see Figure 68). Let $L$ be
assigned the direction shown in the figure, corresponding to successively
traversing $L_1$ in the counterclockwise direction, $\Gamma$ from right to left, $L_2$ in the
clockwise direction, and finally $\Gamma$ from left to right (this arc is denoted by $-\Gamma$).
Then, by Cauchy’s theorem,

$$0 = \oint_L f(z)\,dz = \oint_{L_1\,\circlearrowleft} f(z)\,dz + \int_{\Gamma} f(z)\,dz + \oint_{L_2\,\circlearrowright} f(z)\,dz + \int_{-\Gamma} f(z)\,dz$$

(concerning the meaning of the integrals with arrowheads, see Sec. 9.123). But
reversing the direction of integration changes the sign of a line integral (cf. p.
363). Hence the integrals along $\Gamma$ and $-\Gamma$ cancel each other out, and we are left
with

$$\oint_{L_1\,\circlearrowleft} f(z)\,dz = -\oint_{L_2\,\circlearrowright} f(z)\,dz = \oint_{L_2\,\circlearrowleft} f(z)\,dz. \tag{3}$$


Figure 68

c. THEOREM. Let $f(z)$ be analytic on a simply connected domain $G$ containing a
point $z_0$ and a piecewise smooth closed path $L$ surrounding $z_0$. Then

$$f(z_0) = \frac{1}{2\pi i}\oint_L \frac{f(z)}{z - z_0}\,dz, \tag{4}$$

a result known as Cauchy’s formula.


Proof. Let

$$I = \oint_L \frac{f(z)}{z - z_0}\,dz.$$

Then the function

$$\frac{f(z)}{z - z_0} \tag{5}$$

in general fails to be differentiable at $z = z_0$, so that $I$ is in general nonvanishing.
But the function (5), like $f(z)$ itself, is analytic at every point of $G$ except $z_0$.
Hence, according to Sec. 10.33b, the integral $I$ does not change if the contour $L$
is replaced by a circle $\gamma_\varepsilon$ of radius $\varepsilon$ centered at the point $z_0$. Now let

$$f(z) = f(z_0) + \varphi(z).$$

Then the function

$$\frac{\varphi(z)}{z - z_0}, \tag{6}$$

like $f(z)$ itself, is continuous (and hence integrable) for $z \ne z_0$, and approaches the
limit $f'(z_0)$ as $z \to z_0$. Hence (6) is bounded in a neighborhood of $z_0$, so that

$$\left|\frac{\varphi(z)}{z - z_0}\right| < M,$$

say. It follows that

$$\frac{f(z)}{z - z_0} = \frac{f(z_0)}{z - z_0} + \frac{\varphi(z)}{z - z_0},$$

and hence

$$I = f(z_0)\oint_{\gamma_\varepsilon} \frac{dz}{z - z_0} + \oint_{\gamma_\varepsilon} \frac{\varphi(z)}{z - z_0}\,dz. \tag{7}$$

The first term on the right equals $2\pi i f(z_0)$, by Sec. 10.33a, while the second
integral satisfies the inequality

$$\left|\oint_{\gamma_\varepsilon} \frac{\varphi(z)}{z - z_0}\,dz\right| \le 2\pi M\varepsilon,$$

by formula (5), p. 383, and hence approaches zero as $\varepsilon \to 0$. But the other two
terms in (7) are constant, and hence

$$I = 2\pi i f(z_0),$$

which is equivalent to (4). ∎


Cauchy’s formula (4), expressing the value of an analytic function f(z) inside a 
contour L in terms of the values of f(z) on L (via a line integral along L), is one 
of the most important results of the theory of analytic functions. 
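
As a numerical illustration of Cauchy’s formula, the sketch below recovers the value of $e^z$ at an interior point from its values on the unit circle. It is only a sketch under stated assumptions: the helper names `circle_integral` and `cauchy_value`, the test function, and the point `z0` are chosen here for illustration and do not come from the text.

```python
import cmath
import math

def circle_integral(f, center, radius, samples=2000):
    """Trapezoidal approximation of the counterclockwise integral of f
    over the circle |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        total += f(center + radius * e) * (1j * radius * e) * dtheta
    return total

def cauchy_value(f, z0, center=0j, radius=1.0):
    """Value f(z0) reconstructed from boundary values via Cauchy's formula."""
    integral = circle_integral(lambda z: f(z) / (z - z0), center, radius)
    return integral / (2j * math.pi)

z0 = 0.3 + 0.2j                       # hypothetical point inside the unit circle
approx = cauchy_value(cmath.exp, z0)  # uses only values of exp on |z| = 1
exact = cmath.exp(z0)
print(approx, exact)
```

The two printed values should agree to many digits, since the trapezoidal rule converges very quickly for smooth periodic integrands.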


10.34. THEOREM. If $f(z)$ is analytic on a domain $G$, then $f(z)$ has derivatives of all
orders on $G$ (themselves analytic), given by

$$f^{(n)}(z) = \frac{n!}{2\pi i}\oint_L \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta \qquad (n = 1, 2, \ldots),$$

where the integral is along any piecewise smooth closed path $L \subset G$
surrounding the point $z$.


Proof. An immediate consequence of Theorem 10.28 and Cauchy’s formula, 
which shows that f(z) can be represented on G as an integral of the Cauchy type. 
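
The derivative formula of Theorem 10.34 can be tried out numerically. In the sketch below (helper names, the test function $\sin z$, and the point $z = 0.5$ are assumptions made for illustration) the path $L$ is a circle of radius 1 centered at $z$.

```python
import cmath
import math

def circle_integral(f, center, radius, samples=2000):
    """Trapezoidal approximation of the counterclockwise integral of f
    over the circle |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        total += f(center + radius * e) * (1j * radius * e) * dtheta
    return total

def cauchy_derivative(f, z, n, radius=1.0):
    """n-th derivative of f at z from the formula
    f^(n)(z) = n!/(2 pi i) * integral of f(zeta)/(zeta - z)^(n+1)."""
    integral = circle_integral(lambda zeta: f(zeta) / (zeta - z) ** (n + 1), z, radius)
    return math.factorial(n) * integral / (2j * math.pi)

z = 0.5  # hypothetical evaluation point
derivs = [cauchy_derivative(cmath.sin, z, n) for n in (1, 2, 3)]
expected = [math.cos(z), -math.sin(z), -math.cos(z)]
print(derivs, expected)
```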


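10.35. a. THEOREM. Every function $f(z)$ analytic on a disk $K = \{z : |z - z_0| < r\}$ can
be expanded in a convergent power series

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n \tag{8}$$

on $K$.


Proof. Starting from Cauchy’s formula

$$f(z) = \frac{1}{2\pi i}\oint_L \frac{f(\zeta)}{\zeta - z}\,d\zeta,$$

valid for any circle $L = \{\zeta : |\zeta - z_0| = r - \delta\}$ and any point $z$ inside $L$, we
write

$$\frac{1}{\zeta - z} = \frac{1}{(\zeta - z_0) - (z - z_0)} = \frac{1}{\zeta - z_0}\cdot\frac{1}{1 - \dfrac{z - z_0}{\zeta - z_0}} = \frac{1}{\zeta - z_0}\sum_{n=0}^{\infty}\left(\frac{z - z_0}{\zeta - z_0}\right)^n.$$

This series converges uniformly in $\zeta$ on $L$, because of the estimate

$$\left|\frac{z - z_0}{\zeta - z_0}\right| = \frac{|z - z_0|}{r - \delta} < 1$$

and Weierstrass’ test (Theorem 6.53). It follows from Theorem 10.22a that

$$f(z) = \frac{1}{2\pi i}\oint_L \frac{f(\zeta)}{\zeta - z_0}\sum_{n=0}^{\infty}\left(\frac{z - z_0}{\zeta - z_0}\right)^n d\zeta = \sum_{n=0}^{\infty} (z - z_0)^n\,\frac{1}{2\pi i}\oint_L \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta,$$

which is of the form (8) with

$$a_n = \frac{1}{2\pi i}\oint_L \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta. \tag{9}$$

∎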


b. THEOREM. Every function $f(z)$ analytic on a disk $K = \{z : |z - z_0| < r\}$ can be
expanded in a convergent power series of the form

$$f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(z_0)}{n!}(z - z_0)^n \tag{10}$$

on $K$, known as the Taylor series of $f(z)$ on $K$ (cf. Sec. 8.5).


Proof. Because of Theorem 10.34, the “Taylor coefficients” (9) can be written as

$$a_n = \frac{f^{(n)}(z_0)}{n!}. \tag{9'}$$

∎

c. COROLLARY. If $f(z)$ is analytic on a disk $K = \{z : |z - z_0| < r\}$ and if $f^{(n)}(z_0) = 0$ for
all $n = 0, 1, 2, \ldots$, then $f(z) \equiv 0$ on $K$.


Proof. An immediate consequence of (10). ∎
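
To see formulas (9) and (10) at work, the following sketch computes Taylor coefficients by contour integration and compares them with the known expansion of a simple function. The function $1/(2 - z)$, the center $z_0 = 0$, the circle of radius 1, and all helper names are assumptions chosen for illustration; its exact coefficients are $a_n = 2^{-(n+1)}$.

```python
import cmath
import math

def circle_integral(f, center, radius, samples=2000):
    """Trapezoidal approximation of the counterclockwise integral of f
    over the circle |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        total += f(center + radius * e) * (1j * radius * e) * dtheta
    return total

def taylor_coefficient(f, z0, n, radius=1.0):
    """Coefficient a_n from formula (9): a contour integral around z0."""
    integral = circle_integral(lambda zeta: f(zeta) / (zeta - z0) ** (n + 1), z0, radius)
    return integral / (2j * math.pi)

f = lambda z: 1 / (2 - z)  # analytic on |z| < 2
coeffs = [taylor_coefficient(f, 0j, n) for n in range(40)]

z = 0.5  # hypothetical point inside the disk of convergence
partial_sum = sum(a * z ** n for n, a in enumerate(coeffs))
print(coeffs[:4], partial_sum, f(z))
```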


10.36. a. A sequence (or series) of functions defined on a domain $G$ is said to
converge uniformly inside $G$ if the sequence (or series) converges uniformly on
every compact set $Q$ contained in $G$.†


THEOREM. If $Q$ is a compact set contained in a domain $G$, then $G$ contains a
piecewise smooth closed path $L$ surrounding $Q$.


Proof. Every point $z_0 \in Q$ is the center of an open disk $K = \{z : |z - z_0| < r\}$ such
that $G$ contains $K$ and its boundary, and the union of all such disks clearly
contains $Q$. By the finite covering theorem (Sec. 3.97), there is a finite number
of such disks $K_1, \ldots, K_m$ such that $Q \subset K_1 \cup \cdots \cup K_m$. But the boundary of the
domain $K_1 \cup \cdots \cup K_m \subset G$ is a piecewise smooth closed path $L$ (made up of a
finite number of circular arcs) surrounding $Q$. ∎


b. THEOREM (Weierstrass’ theorem). If $f_n(z)$ is a sequence of analytic functions
converging uniformly inside a domain $G$ to a limit function $f(z)$, then $f(z)$ is
analytic on $G$. Moreover, the sequence of derivatives $f_n^{(m)}(z)$ of any order $m$
converges uniformly inside $G$ to the corresponding derivative $f^{(m)}(z)$ of $f(z)$.


Proof. By Cauchy’s formula,

$$f_n(z) = \frac{1}{2\pi i}\oint_L \frac{f_n(\zeta)}{\zeta - z}\,d\zeta$$

for any piecewise smooth closed path $L \subset G$ surrounding a given point $z \in G$. But

$$\sup_{\zeta \in L}\left|\frac{1}{\zeta - z}\right| = \frac{1}{\delta},$$

where

$$\delta = \inf_{\zeta \in L} |\zeta - z|,$$

while the sequence $f_n(\zeta)$ converges uniformly to $f(\zeta)$ on $L$ (as $n \to \infty$), since $L$ is
a compact set. It follows that

$$\frac{f_n(\zeta)}{\zeta - z}$$

converges uniformly on $L$ to

$$\frac{f(\zeta)}{\zeta - z},$$

and hence

$$f(z) = \frac{1}{2\pi i}\oint_L \frac{f(\zeta)}{\zeta - z}\,d\zeta$$

by Theorem 10.22a, i.e., $f(z)$ can also be represented on $G$ as an integral of the
Cauchy type. Therefore $f(z)$ is analytic on $G$, by Theorem 10.28.

To prove the second part of the theorem, we note that, by Theorem 10.28
again,

$$f^{(m)}(z) = \frac{m!}{2\pi i}\oint_L \frac{f(\zeta)}{(\zeta - z)^{m+1}}\,d\zeta, \qquad
f_n^{(m)}(z) = \frac{m!}{2\pi i}\oint_L \frac{f_n(\zeta)}{(\zeta - z)^{m+1}}\,d\zeta,$$

where the integrals are along a piecewise smooth closed path $L \subset G$ surrounding
any given compact set $Q \subset G$ (the existence of $L$ follows from the preceding
theorem). But this time

$$\sup_{\zeta \in L,\ z \in Q}\left|\frac{1}{(\zeta - z)^{m+1}}\right| = \frac{1}{\delta^{m+1}},$$

where

$$\delta = \inf_{\zeta \in L,\ z \in Q} |\zeta - z|,$$

while the sequence

$$\frac{f_n(\zeta)}{(\zeta - z)^{m+1}}$$

converges uniformly on $L \times Q$ to

$$\frac{f(\zeta)}{(\zeta - z)^{m+1}}.$$

Hence, by Theorem 10.22b,

$$f^{(m)}(z) = \lim_{n \to \infty} f_n^{(m)}(z),$$

where the convergence is uniform on every compact set $Q \subset G$. ∎

10.37. a. Next we prove the converse of Theorem 10.35a: 


THEOREM. Let $r$ be the radius of convergence of the power series

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n. \tag{11}$$

Then $f(z)$ is analytic on the disk $K = \{z : |z - z_0| < r\}$, with derivative

$$f'(z) = \sum_{n=1}^{\infty} n a_n (z - z_0)^{n-1} \tag{12}$$

on $K$.


Proof. The power series (11) converges uniformly on any disk

$$K_1 = \{z : |z - z_0| \le r_1\}$$

with $r_1 < r$ (see Theorem 6.65a). But every compact set $Q \subset K$ is contained in
such a disk, since otherwise we could find a sequence of points $z_n \in Q$ such that
$|z_n - z_0| \to r$, which is impossible, since the sequence $z_n$ would then have a limit
point $z^* \in Q$,† where $|z^* - z_0| = r$, contrary to the condition $Q \subset K$. Hence the
sequence of partial sums

$$f_n(z) = \sum_{k=0}^{n} a_k (z - z_0)^k$$

of the series (11) converges uniformly inside $K$ to the function (11). But the
functions $f_n(z)$ are obviously analytic on $K$, and hence so is $f(z)$ itself, by
Weierstrass’ theorem. The validity of (12) is an immediate consequence of the
second part of Weierstrass’ theorem. ∎


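b. In particular, the above theorem implies the following differentiation
formulas, valid for all complex $z$:

$$(e^z)' = \left(1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots\right)' = 1 + z + \frac{z^2}{2!} + \cdots = e^z, \tag{13}$$

$$(\sin z)' = \left(z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots\right)' = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots = \cos z, \tag{14}$$

$$(\cos z)' = \left(1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots\right)' = -z + \frac{z^3}{3!} - \cdots = -\sin z, \tag{15}$$

$$(\sinh z)' = \left(z + \frac{z^3}{3!} + \frac{z^5}{5!} + \cdots\right)' = 1 + \frac{z^2}{2!} + \frac{z^4}{4!} + \cdots = \cosh z, \tag{16}$$

$$(\cosh z)' = \left(1 + \frac{z^2}{2!} + \frac{z^4}{4!} + \cdots\right)' = z + \frac{z^3}{3!} + \cdots = \sinh z. \tag{17}$$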


We could just as well have deduced (14)-(17) from (13), by using the known 
expressions for the trigonometric and hyperbolic functions in terms of the 
exponential function (Secs. 8.63 and 8.72). 
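
Term-by-term differentiation as in (12) acts on the coefficient sequence by $a_n \mapsto (n+1)a_{n+1}$. The sketch below checks the formulas for $e^z$, $\sin z$ and $\cos z$ on truncated coefficient lists with exact rational arithmetic. The function names and the truncation length are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from math import factorial

def exp_coeffs(n_terms):
    """Taylor coefficients of e^z at 0: 1/n!."""
    return [Fraction(1, factorial(n)) for n in range(n_terms)]

def sin_coeffs(n_terms):
    """Taylor coefficients of sin z at 0: nonzero only for odd n."""
    return [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 else Fraction(0)
            for n in range(n_terms)]

def cos_coeffs(n_terms):
    """Taylor coefficients of cos z at 0: nonzero only for even n."""
    return [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 0 else Fraction(0)
            for n in range(n_terms)]

def differentiate(coeffs):
    """Coefficients of the termwise derivative: b_n = (n + 1) a_{n+1}."""
    return [n * coeffs[n] for n in range(1, len(coeffs))]

d_exp = differentiate(exp_coeffs(10))
d_sin = differentiate(sin_coeffs(10))
d_cos = differentiate(cos_coeffs(10))
print(d_sin[:5])  # first coefficients of cos z
```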


c. There is a natural connection between the radius of convergence $r$ of the series
(11) and the “analyticity” of $f(z)$. In fact, $r$ is just the radius of the largest disk
centered at $z_0$ on which $f(z)$ is analytic, since, by Theorem 10.35b, $f(z)$ is the sum
of its own Taylor series on every such disk. Thus, for example, the radius of
convergence of the series

$$f(z) = \frac{1}{1 + z^2} = 1 - z^2 + z^4 - z^6 + \cdots \tag{18}$$

equals 1, since the distance from the point $z_0 = 0$ to the nearest point at which
$f(z)$ fails to be analytic, i.e., the point $z = i$ or $z = -i$, equals 1. It would be hard to
relate the divergence of the series (18) at the points $z = \pm 1$ to the behavior of the
function $f(z)$ for real $z = x$ only.

Points at which a function $f(z)$ fails to be analytic (like the points $z = \pm i$ in the
above example) are called singular points or singularities of $f(z)$.


d. A point $z_0$ is called a zero (or root) of a function $f(z)$ analytic on a domain $G$
containing $z_0$ if $f(z_0) = 0$. Moreover, we call $z_0$ a zero of order (or multiplicity) $k$
if

$$f(z_0) = f'(z_0) = \cdots = f^{(k-1)}(z_0) = 0, \qquad f^{(k)}(z_0) \ne 0.$$

Thus an analytic function $f(z)$ with a zero of order $k$ at the point $z = z_0$ has a
Taylor series expansion of the form

$$f(z) = \frac{f^{(k)}(z_0)}{k!}(z - z_0)^k + \frac{f^{(k+1)}(z_0)}{(k+1)!}(z - z_0)^{k+1} + \cdots$$

in a neighborhood of $z_0$. This can also be written as

$$f(z) = (z - z_0)^k g(z),$$

where the function

$$g(z) = \frac{f^{(k)}(z_0)}{k!} + \frac{f^{(k+1)}(z_0)}{(k+1)!}(z - z_0) + \cdots$$

is analytic in a neighborhood of $z_0$, being the sum of a convergent power series,
and hence is analytic on the whole domain $G$, being the quotient of two analytic
functions $f(z)$ and $(z - z_0)^k$. Note that $g(z_0)$ is just

$$g(z_0) = \frac{1}{k!} f^{(k)}(z_0) \ne 0.$$


10.38. A function $f(z)$ is said to be entire if it is analytic on the whole complex
plane $C$. According to Theorem 10.35, an entire function $f(z)$ has a Taylor series
expansion

$$f(z) = \sum_{n=0}^{\infty} a_n z^n \qquad (z_0 = 0) \tag{19}$$

which converges for all $z \in C$.


THEOREM (Liouville’s theorem). If $f(z)$ is an entire function and if $|f(z)| < M$ for all
$z \in C$, then $f(z) \equiv$ const, i.e., every bounded entire function is a constant.


Proof. According to Theorem 10.35b, the coefficients of the series (19) are given
by

$$a_n = \frac{1}{2\pi i}\oint_L \frac{f(\zeta)}{\zeta^{n+1}}\,d\zeta, \tag{20}$$

where we choose $L$ to be a circle of radius $R$ centered at the origin. Using the
estimate (5), p. 383 and the fact that $|f(z)| < M$ for all $z \in C$, we have

$$|a_n| \le \frac{1}{2\pi}\cdot\frac{M}{R^{n+1}}\cdot 2\pi R = \frac{M}{R^n}.$$

But $R$ can be made arbitrarily large, since $f(z)$ is entire, and hence

$$a_n = 0 \qquad (n = 1, 2, \ldots),$$

so that

$$f(z) \equiv a_0 = \text{const}.$$

∎


10.39. a. THEOREM (Uniqueness theorem for analytic functions). If $f(z)$ is
analytic on a domain $G$ and vanishes on a sequence of points $z_1, \ldots, z_n, \ldots$
converging to a point $z_0 \in G$, then $f(z) \equiv 0$ on $G$.

Proof. By Theorem 10.35a, $f(z)$ has a convergent power series expansion

$$f(z) = a_0 + a_1(z - z_0) + a_2(z - z_0)^2 + \cdots \tag{21}$$

on every disk $K = \{z : |z - z_0| < r\} \subset G$. First we show that $f(z) \equiv 0$ on $K$. To this
end, we introduce further power series

$$\begin{aligned}
f_1(z) &= a_1 + a_2(z - z_0) + a_3(z - z_0)^2 + \cdots,\\
f_2(z) &= a_2 + a_3(z - z_0) + a_4(z - z_0)^2 + \cdots,\\
&\ \,\vdots
\end{aligned} \tag{22}$$

where each new series is obtained from the preceding one by dropping the
constant term and dividing by $z - z_0$. The series (22) all converge on the same
disk as the series (21) itself, and the functions $f_m(z)$ are all continuous at $z = z_0$,
like $f(z)$ itself (see Corollary 6.65b). It follows by induction that

$$\begin{aligned}
f(z_n) &= 0, & a_0 &= f(z_0) = \lim_{n \to \infty} f(z_n) = 0,\\
f_1(z_n) &= \frac{f(z_n) - f(z_0)}{z_n - z_0} = 0, & a_1 &= f_1(z_0) = \lim_{n \to \infty} f_1(z_n) = 0,\\
&\ \,\vdots\\
f_m(z_n) &= \frac{f_{m-1}(z_n) - f_{m-1}(z_0)}{z_n - z_0} = 0, & a_m &= f_m(z_0) = \lim_{n \to \infty} f_m(z_n) = 0.
\end{aligned}$$

Hence the coefficients $a_m$ all vanish. But then $f(z) \equiv 0$ on $K$, as asserted.
Having just shown that $f(z) \equiv 0$ on any disk $K \subset G$ centered at $z_0$, we now
show that $f(z) \equiv 0$ on the whole domain $G$. Let $z^*$ be any point of $G$ other than $z_0$.
Then, since $G$ is connected, there is a piecewise smooth path $L \subset G$ joining $z_0$ to
$z^*$. Let

$$\rho = \inf_{z \in L} r(z),$$

where $r(z)$ is the radius of the largest disk contained in $G$ and centered at $z$. Then
$\rho > 0$, by the compactness of $L$ (why?). As just shown, $f(z) \equiv 0$ on the disk $K_0$ of
radius $\rho$ centered at $z_0$. Suppose we shift $K_0$ along $L$ by sliding its center along $L$
from $z_0$ to $z^*$, thereby generating a “chain of shifted disks,” the first centered at
$z_0$ and the last centered at $z^*$. Making sure that the center of each shifted disk lies
inside the “preceding disk” (draw a figure), we find that $f(z) \equiv 0$ on each shifted
disk, by the same argument used to show that $f(z) \equiv 0$ on $K_0$. It follows that $f(z^*)
= 0$ and hence that $f(z) \equiv 0$ on $G$, since the point $z^* \in G$ is arbitrary. ∎


Another version of the uniqueness theorem for analytic functions is given by 
the following 


b. THEOREM. If two functions $f_1(z)$ and $f_2(z)$ are analytic on a domain $G$ and
coincide on a sequence of points $z_1, \ldots, z_n, \ldots$ converging to a point $z_0 \in G$, then
$f_1(z) \equiv f_2(z)$ on $G$.

Proof. If $f(z) = f_1(z) - f_2(z)$, then $f(z)$ is analytic on $G$ and vanishes on the
sequence $z_1, \ldots, z_n, \ldots$ Hence, by the preceding theorem, $f(z) \equiv 0$ on $G$, i.e., $f_1(z) \equiv
f_2(z)$ on $G$. ∎


c. The above considerations lead at once to the idea of “analytic continuation,”
anticipated in Sec. 8.55. Suppose $f_1(z)$ is analytic on a domain $G_1$, while $f_2(z)$ is
analytic on a domain $G_2$, and suppose the intersection of $G_1$ and $G_2$ is a
(nonempty) domain $G_0$. Suppose further that $f_1(z)$ and $f_2(z)$ coincide on $G_0$ (this
will be the case, for example, if $f_1(z_n) = f_2(z_n)$ on a sequence of points $z_1, \ldots, z_n, \ldots$
in $G_0$ converging to a point $z_0 \in G_0$). Then the function

$$f(z) = \begin{cases} f_1(z) & \text{if } z \in G_1,\\ f_2(z) & \text{if } z \in G_2, \end{cases}$$

called the analytic continuation of $f_1(z)$ from $G_1$ into $G$, is “single-valued” (see
Sec. 2.84) and analytic on the whole domain $G = G_1 \cup G_2$. Note that the analytic
continuation of $f_1(z)$ from $G_1$ into $G$ is necessarily unique (why?).


d. Example. The function

$$f_1(z) = \sum_{n=0}^{\infty} z^n \qquad (|z| < 1)$$

is analytic, being the sum of a convergent power series. The function

$$f(z) = \frac{1}{1 - z} \qquad (z \ne 1),$$

equal to $f_1(z)$ if $|z| < 1$, is the analytic continuation of $f_1(z)$ from the disk $|z| < 1$
into the domain $C' = C - \{1\}$ (the complex plane minus the point $z = 1$).
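
The example can be made concrete with a few lines of code. This is a small sketch with hypothetical names (`geometric_partial_sum`, `continuation`) and sample points chosen for illustration: inside the unit disk the partial sums of the series agree with $1/(1 - z)$, while outside the disk only the continuation makes sense.

```python
def geometric_partial_sum(z, terms):
    """Partial sum 1 + z + ... + z^(terms - 1) of the geometric series."""
    total, power = 0j, 1 + 0j
    for _ in range(terms):
        total += power
        power *= z
    return total

def continuation(z):
    """The function 1/(1 - z), defined for every z other than 1."""
    return 1 / (1 - z)

inside = 0.5j    # a point with |z| < 1
outside = -2.0   # a point with |z| > 1, where the series diverges

gap_inside = abs(geometric_partial_sum(inside, 80) - continuation(inside))
size_outside = abs(geometric_partial_sum(outside, 80))
print(gap_inside, size_outside, continuation(outside))
```

At the outside point the partial sums grow without bound, yet the continuation takes the finite value $1/3$ there.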


e. THEOREM. If $f(z)$ is analytic on $G$ and if $f^{(n)}(z_0) = 0$ for all $n = 0, 1, \ldots$ at a point
$z_0 \in G$, then $f(z) \equiv 0$ on $G$.


Proof. Let $K = \{z : |z - z_0| < r\}$ be a disk contained in $G$ and centered at $z_0$. Then
$f(z) \equiv 0$ on $K$, by Corollary 10.35c. Now apply Theorem 10.39a. ∎


f. Real analytic functions. A function $f(z)$ analytic on a domain $G$ containing
points of the real axis is called a real analytic function if all its values on the real
axis are themselves real. Since we can calculate the values of the derivative of an
analytic function $f(z)$ on the real axis using only the values of $f(z)$ on the real
axis, it is clear that the derivative $f'(z)$ of a real analytic function is itself a real
analytic function. By the same token, the higher-order derivatives $f''(z), f'''(z), \ldots$
are all real analytic functions. Given any point $z_0 \in G$ lying on the real axis, let

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$$

be the Taylor series expansion of $f(z)$ at $z_0$, converging on some disk $K$ centered
at $z_0$. Then the coefficients

$$a_n = \frac{f^{(n)}(z_0)}{n!}$$

are real, and hence

$$\overline{f(\bar z)} = \overline{\sum_{n=0}^{\infty} a_n (\bar z - z_0)^n} = \sum_{n=0}^{\infty} a_n (z - z_0)^n = f(z) \tag{23}$$

for all $z \in K$, so that $\overline{f(\bar z)} = f(z)$ on or near the real axis. More generally, we have
the following


THEOREM. Let $f(z)$ be a real analytic function on a domain $G$. Then $f(z)$ satisfies
the condition

$$\overline{f(\bar z)} = f(z)$$

at every point $z \in G$ such that $\bar z$ also belongs to $G$.


Proof. By retaining only those points $z \in G$ such that $\bar z$ also belongs to $G$, we can
assume that $G$ is symmetric with respect to the real axis. Let $G^+$ denote the set of
all points $z \in G$ with $\operatorname{Im} z > 0$ and $G^-$ the set of all points $z \in G$ with $\operatorname{Im} z < 0$.
Suppose we introduce a new function

$$f_1(z) = \overline{f(\bar z)} \qquad (z \in G^+),$$

whose definition involves the values of $f(z)$ for $z \in G^-$. Let $z_0 \in G^+$, so that $\bar z_0 \in
G^-$. Being analytic on $G^-$, $f(z)$ has a Taylor series expansion

$$f(z) = \sum_{n=0}^{\infty} a_n (z - \bar z_0)^n,$$

valid in a neighborhood of every point $\bar z_0 \in G^-$. Hence

$$f_1(z) = \overline{f(\bar z)} = \overline{\sum_{n=0}^{\infty} a_n (\bar z - \bar z_0)^n} = \sum_{n=0}^{\infty} \bar a_n (z - z_0)^n$$

in a neighborhood of every point $z_0 \in G^+$, so that $f_1(z)$ is analytic on $G^+$. But $f_1(z)
= f(z)$ on or near the real axis, by (23). It follows from Theorem 10.39b that $f_1(z)
= \overline{f(\bar z)} = f(z)$ on the “upper domain” $G^+$. Reversing the roles of $G^+$ and $G^-$, we
find in the same way that $\overline{f(\bar z)} = f(z)$ on the “lower domain” $G^-$ as well. Thus,
finally, $\overline{f(\bar z)} = f(z)$ on the whole domain $G$. ∎
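
The symmetry $\overline{f(\bar z)} = f(z)$ is easy to observe numerically for a function that is real on the real axis. In the sketch below the particular function and sample points are assumptions chosen for illustration; any entire function with real Taylor coefficients would serve equally well.

```python
import cmath

def f(z):
    """An entire function taking real values on the real axis."""
    return cmath.exp(z) + z ** 3 - 2 * z

points = [0.3 + 0.7j, -1.2 + 0.4j, 2 - 3j]  # hypothetical sample points

# Largest deviation between conj(f(conj(z))) and f(z) over the samples.
max_gap = max(abs(f(z.conjugate()).conjugate() - f(z)) for z in points)
print(max_gap)
```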


10.4. Residues and Isolated Singular Points 


10.41. Residues. Let $L$ be a piecewise smooth closed path with no multiple
points (Sec. 10.31), let $D$ be the interior of $L$, and let $f(z)$ be analytic on $L \cup D$.
Then the integral

$$\frac{1}{2\pi i}\oint_L f(z)\,dz \tag{1}$$

vanishes, by Cauchy’s theorem (Sec. 10.32). On the other hand, if $f(z)$ fails to be
analytic at certain points of $D$, called singular points (Sec. 10.37c),† then the
integral (1) may well be nonzero. For example, if

$$f(z) = \frac{g(z)}{z - z_0},$$

where $g(z)$ is analytic on $L \cup D$, then, by Cauchy’s formula (Theorem 10.33c),

$$\frac{1}{2\pi i}\oint_L \frac{g(z)}{z - z_0}\,dz = g(z_0).$$

More generally, let $f(z)$ be any function analytic on $L \cup D$ except at a point $z_0 \in
D$. Then the value of the integral (1) is called the residue of $f(z)$ at $z = z_0$, denoted
by

$$\operatorname*{Res}_{z = z_0} f(z).$$

Note that the value of (1) does not change if we replace $L$ by any other piecewise
smooth closed path $L' \subset D$ surrounding $z_0$.


10.42. Examples 


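a. Let

$$f(z) = \frac{g(z)}{P(z)}, \tag{2}$$

where $g(z)$ is analytic on $L \cup D$ and $P(z)$ is a polynomial with a simple zero at $z_0
\in D$ (so that $P(z_0) = 0$, $P'(z_0) \ne 0$) and no other zeros in $D$. Then

$$P(z) = a_1(z - z_0) + a_2(z - z_0)^2 + \cdots = (z - z_0)P_1(z)$$

(Sec. 10.37d), where

$$P_1(z) = a_1 + a_2(z - z_0) + \cdots$$

is a polynomial equal to

$$a_1 = P'(z_0) \ne 0$$

at $z = z_0$, with no zeros in $D$. By Cauchy’s formula,

$$\frac{1}{2\pi i}\oint_L f(z)\,dz = \frac{1}{2\pi i}\oint_L \frac{g(z)}{P(z)}\,dz = \frac{1}{2\pi i}\oint_L \frac{g(z)}{(z - z_0)P_1(z)}\,dz = \frac{g(z_0)}{P_1(z_0)} = \frac{g(z_0)}{P'(z_0)},$$

so that

$$\operatorname*{Res}_{z = z_0} f(z) = \operatorname*{Res}_{z = z_0} \frac{g(z)}{P(z)} = \frac{g(z_0)}{P'(z_0)}. \tag{3}$$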
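
Formula (3) can be compared with a direct numerical evaluation of the integral (1). The sketch below uses an illustrative choice that is not taken from the text: $g(z) = e^z$ and $P(z) = z^2 + 1$, with the simple zero $z_0 = i$ enclosed by a small circle that excludes the other zero $-i$. The helper name `circle_integral` is likewise an assumption.

```python
import cmath
import math

def circle_integral(f, center, radius, samples=2000):
    """Trapezoidal approximation of the counterclockwise integral of f
    over the circle |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        total += f(center + radius * e) * (1j * radius * e) * dtheta
    return total

g = cmath.exp
P = lambda z: z * z + 1
P_prime = lambda z: 2 * z

z0 = 1j  # simple zero of P inside the circle |z - i| = 0.5
numeric_residue = circle_integral(lambda z: g(z) / P(z), z0, 0.5) / (2j * math.pi)
formula_residue = g(z0) / P_prime(z0)  # formula (3): g(z0) / P'(z0)
print(numeric_residue, formula_residue)
```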


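b. Again let $f(z)$ be given by (2), where $g(z)$ is analytic on $L \cup D$, but this time
let $P(z)$ be a polynomial with a zero of order $k$ at $z_0 \in D$ (so that $P(z_0) = \cdots = P^{(k-1)}(z_0)
= 0$, $P^{(k)}(z_0) \ne 0$) and no other zeros in $D$. Then

$$P(z) = a_k(z - z_0)^k + a_{k+1}(z - z_0)^{k+1} + \cdots = (z - z_0)^k P_k(z) \qquad (a_k \ne 0),$$

where

$$P_k(z) = a_k + a_{k+1}(z - z_0) + \cdots$$

is a polynomial equal to

$$a_k = \frac{P^{(k)}(z_0)}{k!}$$

at $z = z_0$, with no zeros in $D$. Expanding the function $g(z)/P_k(z)$ as a Taylor series
in powers of $z - z_0$, we get

$$\frac{g(z)}{P_k(z)} = b_0 + b_1(z - z_0) + b_2(z - z_0)^2 + \cdots,$$

and hence

$$f(z) = \frac{g(z)}{P(z)} = \frac{g(z)}{(z - z_0)^k P_k(z)} = \frac{b_0}{(z - z_0)^k} + \frac{b_1}{(z - z_0)^{k-1}} + \cdots + \frac{b_{k-1}}{z - z_0} + b_k + b_{k+1}(z - z_0) + \cdots.$$

Integrating $f(z)$ term by term along $L$ (why is this justified?), we see that only
the term containing the coefficient $b_{k-1}$ survives (see Sec. 10.33a), i.e.,

$$\frac{1}{2\pi i}\oint_L f(z)\,dz = \frac{1}{2\pi i}\oint_L \frac{g(z)}{P(z)}\,dz = \frac{b_{k-1}}{2\pi i}\oint_L \frac{dz}{z - z_0} = b_{k-1}
= \frac{1}{(k-1)!}\left[\frac{g(z)}{P_k(z)}\right]^{(k-1)}_{z = z_0} = \frac{1}{(k-1)!}\left[\frac{g(z)(z - z_0)^k}{P(z)}\right]^{(k-1)}_{z = z_0},$$

with the help of Theorem 10.35b. It follows that

$$\operatorname*{Res}_{z = z_0} f(z) = \operatorname*{Res}_{z = z_0} \frac{g(z)}{P(z)} = \frac{1}{(k-1)!}\left[\frac{g(z)(z - z_0)^k}{P(z)}\right]^{(k-1)}_{z = z_0}. \tag{3'}$$

Figure 69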


10.43. a. To evaluate the integral (1) in the case where the contour $L$ contains
several, say $m$, singular points $z_1, \ldots, z_m$ of the function $f(z)$, we use the fact that

$$\frac{1}{2\pi i}\oint_L f(z)\,dz = \frac{1}{2\pi i}\oint_{L_1} f(z)\,dz + \cdots + \frac{1}{2\pi i}\oint_{L_m} f(z)\,dz = \sum_{j=1}^{m} \operatorname*{Res}_{z = z_j} f(z), \tag{4}$$

where each $L_j \subset D$ is a piecewise smooth closed path surrounding $z_j$ but no other
singular points. The “residue theorem” (4) is an immediate consequence of
Cauchy’s theorem and the construction given in Figure 69, which shows the
original path $L$, the paths $L_1, \ldots, L_m$ (for the case $m = 3$) and appropriate arcs
joining these paths, where each arc is traversed twice in opposite directions.


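b. Example. Evaluate the integral

$$\oint_{|z| = 1} \frac{dz}{z^2 - 2pz + 1} \qquad (p > 1).$$

Solution. The denominator has two simple zeros, whose product equals 1. One
zero

$$z_1 = p - \sqrt{p^2 - 1}$$

lies inside the unit circle $|z| = 1$, while the other zero

$$z_2 = p + \sqrt{p^2 - 1}$$

lies outside the unit circle (see Figure 70). It follows from formula (3) that

$$\oint_{|z| = 1} \frac{dz}{z^2 - 2pz + 1} = 2\pi i\,\frac{1}{2z_1 - 2p} = -\frac{\pi i}{\sqrt{p^2 - 1}}.$$

Figure 70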

10.44. Logarithmic residues. Consider the integral

$$N = \frac{1}{2\pi i}\oint_L \frac{f'(z)}{f(z)}\,dz, \tag{5}$$

where $L$ is a piecewise smooth closed path with no multiple points and $D$ is the
interior of $L$, while $f(z)$ is analytic on $L \cup D$ and does not vanish on $L$. If $f(z)$ is
also nonvanishing on $D$, then (5) vanishes by Cauchy’s theorem. If $f(z)$ has zeros
inside $L$ (i.e., in $D$), then by the uniqueness theorem for analytic functions
(Theorem 10.39a), there are only a finite number of such zeros (why?). Let these
zeros be denoted by $z_1, \ldots, z_m$ and their orders by $k_1, \ldots, k_m$. Then we have the
following


THEOREM. The integral $N$, called the logarithmic residue of $f(z)$ with respect to
the contour $L$,† is a positive integer equal to the sum of the orders of all the
zeros of the function $f(z)$ inside $L$.


Proof. As in Sec. 10.43,

$$N = \sum_{j=1}^{m} N_j, \qquad N_j = \frac{1}{2\pi i}\oint_{L_j} \frac{f'(z)}{f(z)}\,dz,$$

where $L_j$ surrounds $z_j$ but no other singular points. For brevity, let $z_j = \zeta$, $k_j = k$.
Then

$$\begin{aligned}
f(z) &= a_k(z - \zeta)^k + a_{k+1}(z - \zeta)^{k+1} + \cdots \qquad (a_k \ne 0),\\
f'(z) &= ka_k(z - \zeta)^{k-1} + (k+1)a_{k+1}(z - \zeta)^k + \cdots,\\
\frac{f'(z)}{f(z)} &= \frac{1}{z - \zeta}\cdot\frac{ka_k + (k+1)a_{k+1}(z - \zeta) + \cdots}{a_k + a_{k+1}(z - \zeta) + \cdots} = \frac{\varphi(z)}{z - \zeta}
\end{aligned}$$

in a neighborhood of the point $\zeta$. Clearly the function $\varphi(z)$ is analytic at the point
$z = \zeta$, where it takes the value $\varphi(\zeta) = k$, and hence

$$N_j = \frac{1}{2\pi i}\oint_{L_j} \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi i}\oint_{L_j} \frac{\varphi(z)}{z - \zeta}\,dz = \varphi(\zeta) = k = k_j.$$

It follows that

$$N = \sum_{j=1}^{m} N_j = \sum_{j=1}^{m} k_j.$$

∎
j=1 j=1 


10.45. Laurent series. We now turn to the “qualitative analysis” of the behavior
of a function $f(z)$ near an isolated singular point $z_0$ (i.e., a singular point $z_0$ with a
neighborhood containing no other singular points), using as our basic tool an
expansion of $f(z)$ in a series involving both positive and negative powers of $z -
z_0$.


a. THEOREM. Every function $f(z)$ analytic on an annulus

$$K = \{z : r_0 < |z - z_0| < R_0\}$$

can be expanded in a convergent “two-sided series” of the form†

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n \tag{6}$$

on $K$, known as the Laurent series of $f(z)$ on $K$.


Proof. The annulus or ring-shaped domain $K$ is shown in Figure 71. Let $r$ and $R$
be numbers such that $r_0 < r < R < R_0$. Then

$$f(z) = \frac{1}{2\pi i}\oint_{|\zeta - z_0| = R} \frac{f(\zeta)}{\zeta - z}\,d\zeta - \frac{1}{2\pi i}\oint_{|\zeta - z_0| = r} \frac{f(\zeta)}{\zeta - z}\,d\zeta,$$

by Cauchy’s theorem applied to the simply connected domain obtained by
cutting the annulus $r < |\zeta - z_0| < R$ along a radius which does not go through
the point $z$.

Figure 71

We make the expansion

$$\frac{1}{\zeta - z} = \frac{1}{(\zeta - z_0) - (z - z_0)} = \frac{1}{\zeta - z_0}\cdot\frac{1}{1 - \dfrac{z - z_0}{\zeta - z_0}} = \sum_{n=0}^{\infty} \frac{(z - z_0)^n}{(\zeta - z_0)^{n+1}}$$

in the first integral and the expansion

$$\frac{1}{\zeta - z} = \frac{1}{(\zeta - z_0) - (z - z_0)} = -\frac{1}{z - z_0}\cdot\frac{1}{1 - \dfrac{\zeta - z_0}{z - z_0}} = -\sum_{n=0}^{\infty} \frac{(\zeta - z_0)^n}{(z - z_0)^{n+1}}$$

in the second integral, where, by Weierstrass’ test (Theorem 6.53), both
expansions are uniformly convergent in $\zeta$ on the corresponding contours of
integration. It follows from Theorem 10.22a that

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n + \sum_{n=-\infty}^{-1} a_n (z - z_0)^n, \tag{7}$$

where the “Laurent coefficients” $a_n$ are given by

$$a_n = \begin{cases}
\dfrac{1}{2\pi i}\displaystyle\oint_{|\zeta - z_0| = R} \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta & \text{if } n \ge 0,\\[2ex]
\dfrac{1}{2\pi i}\displaystyle\oint_{|\zeta - z_0| = r} \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta & \text{if } n < 0.
\end{cases} \tag{8}$$

Note that since the integrand

$$\frac{f(\zeta)}{(\zeta - z_0)^{n+1}}$$

is analytic on the annulus $K$, we can evaluate these integrals along any circle $|z -
z_0| = \rho$, $r_0 < \rho < R_0$, or, for that matter, along any piecewise smooth closed path $L
\subset K$ with no multiple points. In particular, if $r_0 = 0$, the coefficient $a_{-1}$ is just the
residue of $f(z)$ at $z = z_0$. Finally, we can write the expansion (7) in the form (6),
using the fact that each of the “one-sided series” in (7) converges separately (see
Theorem 6.48b). ∎
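
The coefficient formula can be exercised on a function whose Laurent series is known in closed form. The sketch below takes, purely as an illustration, $f(z) = 1/((z - 1)(z - 2))$ on the annulus $1 < |z| < 2$ with $z_0 = 0$; partial fractions give $a_n = -2^{-(n+1)}$ for $n \ge 0$ and $a_n = -1$ for $n \le -1$. The helper names and the radius $\rho = 1.5$ are assumptions.

```python
import cmath
import math

def circle_integral(f, center, radius, samples=2000):
    """Trapezoidal approximation of the counterclockwise integral of f
    over the circle |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dtheta)
        total += f(center + radius * e) * (1j * radius * e) * dtheta
    return total

def laurent_coefficient(f, z0, n, rho):
    """a_n = 1/(2 pi i) * integral of f(zeta)/(zeta - z0)^(n+1) over |zeta - z0| = rho."""
    integral = circle_integral(lambda zeta: f(zeta) / (zeta - z0) ** (n + 1), z0, rho)
    return integral / (2j * math.pi)

f = lambda z: 1 / ((z - 1) * (z - 2))  # analytic on the annulus 1 < |z| < 2
coeffs = {n: laurent_coefficient(f, 0j, n, 1.5) for n in range(-3, 4)}
for n in sorted(coeffs):
    print(n, coeffs[n])
```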


b. According to Theorem 10.22b, the Laurent series (6) converges uniformly
inside the annulus $K$. In particular, the series (6) converges uniformly on every
closed annulus $r \le |z - z_0| \le R$, where $r_0 < r \le R < R_0$.

The Laurent series (6) is uniquely determined by the function $f(z)$, as we see
from (8) or equivalently from the easily verified formula

$$\oint_{|z - z_0| = \rho} \frac{f(z)}{(z - z_0)^m}\,dz = \sum_{n=-\infty}^{\infty} a_n \oint_{|z - z_0| = \rho} (z - z_0)^{n-m}\,dz = 2\pi i\,a_{m-1},$$

where $r_0 < \rho < R_0$, $m = 0, \pm 1, \pm 2, \ldots$


c. The first series on the right in (7), i.e., the series

$$f^+(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n,$$

is called the regular part of the Laurent series (6), while the second series

$$f^-(z) = \sum_{n=-\infty}^{-1} a_n (z - z_0)^n$$

is called the principal part of (6). The regular part of the Laurent series is a
series in positive powers of $z - z_0$, convergent for $|z - z_0| = R$ and hence
convergent on the whole disk $|z - z_0| \le R$ (see Theorem 6.65a) including the
whole “inner region” $|z - z_0| \le r_0$ excluded from the annulus $K$. On the other
hand, the principal part of the Laurent series is a series in negative powers of $z -
z_0$, convergent for $|z - z_0| = r$. Setting

$$\zeta = \frac{1}{z - z_0},$$

we get a series in positive powers of $\zeta$ converging for $|\zeta| = 1/r$ and hence
converging for all $|\zeta| \le 1/r$. Thus the principal part of the Laurent series
converges for all $|z - z_0| \ge r$ including the whole “outer region” $|z - z_0| \ge R_0$
excluded from the annulus $K$.


10.46. Classification of isolated singular points. Let $w = f(z)$ be a function
analytic at every point of a disk $|z - z_0| < R_0$ except at the center $z_0$. Then $f(z)$ can
be expanded in a Laurent series

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n \tag{9}$$

on the “punctured disk” $0 < |z - z_0| < R_0$. There are now essentially three
kinds of structure of the series (9), leading to three designations of the isolated
singular point $z_0$:


a. If all the Laurent coefficients $a_n$ with $n < 0$ vanish, we call $z_0$ a removable
singular point of $f(z)$. In this case, the function $f(z)$ has a power series expansion
of the form

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$$

on the domain $0 < |z - z_0| < R_0$, whose sum is a function analytic on the whole
disk $|z - z_0| < R_0$ including the point $z_0$, at which it takes the value $a_0$. Hence, if
we merely set $f(z_0) = a_0$, the original function $f(z)$ becomes analytic on the whole
disk $|z - z_0| < R_0$, with no singular point at $z_0$. This explains the term “removable
singular point.”


b. If all the Laurent coefficients $a_n$ with $n < -m$ ($m > 0$) vanish and if $a_{-m}$ is
nonzero, the point $z_0$ is called a pole of order $m$ of $f(z)$.


c. If there are nonzero Laurent coefficients with negative indices of arbitrarily
large absolute value, we call $z_0$ an essential singular point of $f(z)$.


10.47. The point at infinity 


a. In the interest of both generality and simplicity, we now “extend” the complex
$z$-plane by adding an extra “point” $z = \infty$, called the point at infinity. By $z \to \infty$
we mean the direction in the $z$-plane consisting of the sets $A_r = \{z : |z| > r\}$, where
$r$ varies over all positive numbers, or at least over all numbers larger than some
number $r_0 > 0$ (cf. Sec. 4.73f). The complex $w$-plane is extended in the same
way by introducing the “point” $w = \infty$ and the direction $w \to \infty$. The points of
the original $z$- and $w$-planes (as opposed to the points $z = \infty$ and $w = \infty$) are now
said to be finite, and the set of all finite numbers together with the point at
infinity (the “number” $\infty$) is called the extended complex plane. Given any
direction $S$ in the $z$-plane and any function $w = f(z)$ defined on the sets of $S$, we
say that $f(z)$ approaches infinity in the direction $S$ and write

$$\lim_S f(z) = \infty$$

if, given any $r > 0$, there exists a set $A \in S$ such that

$$|f(z)| > r$$

for all $z \in A$ (cf. Theorem 4.34a). This, together with the definition of the
direction $z \to \infty$, explains the meaning of the expressions

$$\lim_{z \to z_0} f(z) = \infty, \qquad \lim_{z \to \infty} f(z) = a, \qquad \lim_{z \to \infty} f(z) = \infty$$

(where the points $z_0$ and $a$ are finite). Thus, for example,

(a) $\displaystyle\lim_{z \to z_0} \frac{1}{(z - z_0)^m} = \infty$ $(m = 1, 2, \ldots)$;

(b) $\displaystyle\lim_{z \to z_0} \alpha(z) f(z) = \infty$ if $\displaystyle\lim_{z \to z_0} \alpha(z) = a \ne 0$, $\displaystyle\lim_{z \to z_0} f(z) = \infty$;

(c) $\displaystyle\lim_{z \to z_0} f(z) = \infty$ if and only if $\displaystyle\lim_{z \to z_0} \frac{1}{f(z)} = 0$

(cf. Theorems 4.36b and 4.37a).


b. The point $w = \infty$ is said to be a limit point of $f(z)$ in the direction $S$ if, given
any $r > 0$ and any $A \in S$, there exists a point $z_r \in A$ such that

$$|f(z_r)| > r$$

(cf. Sec. 4.45). If $f(z)$ is unbounded in the direction $S$ (cf. Sec. 4.32), then
obviously $f(z)$ has $w = \infty$ as a limit point in the direction $S$, and conversely.


c. Suppose $f(z)$ is analytic on the domain $G = \{z : |z| > R\}$. Then, by Theorem
10.45a, $f(z)$ can be represented on $G$ as a Laurent series

$$f(z) = \sum_{n=-\infty}^{\infty} a_n z^n. \tag{10}$$

The substitution $z = 1/\zeta$ carries $G$ into the domain $H = \{\zeta : 0 < |\zeta| < 1/R\}$ and $f(z)$
into the function

$$g(\zeta) = \sum_{n=-\infty}^{\infty} a_n \zeta^{-n}. \tag{10'}$$

We say that the original function $f(z)$ has a removable singular point, pole, or
essential singular point at $z = \infty$ if the new function $g(\zeta)$ has the same kind of
singular point at $\zeta = 0$. In other words, the point $z = \infty$ is a removable singular
point of $f(z)$ if all the Laurent coefficients $a_n$ figuring in (10) with $n > 0$ vanish, a
pole of order $m$ if the coefficients $a_{m+1}, a_{m+2}, \ldots$ vanish but $a_m \ne 0$, and an
essential singular point if there are nonzero Laurent coefficients with arbitrarily
large positive indices.


10.48. Behavior of a function at a pole. Suppose $f(z)$ has a pole of order $m$ at
the point $z = z_0$, so that the Laurent expansion of $f(z)$ at $z = z_0$ is of the form

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n,$$

where $a_n = 0$ if $n < -m$ ($m > 0$) and $a_{-m} \ne 0$. Writing

$$\varphi(z) = (z - z_0)^m f(z) = a_{-m} + a_{-m+1}(z - z_0) + \cdots,$$

we see that $\varphi(z)$ has a removable singular point at $z = z_0$ and hence becomes
analytic at $z = z_0$ if we set $\varphi(z_0) = a_{-m}$. It follows that

$$f(z) = \frac{\varphi(z)}{(z - z_0)^m},$$

where

$$\lim_{z \to z_0} f(z) = \infty$$

(Sec. 10.47a) since $\varphi(z_0) \ne 0$. This is the kind of behavior exhibited by the
integrands in Examples 10.42a and 10.42b.


10.49. Behavior of a function at an essential singular point 


a. THEOREM. If the function $f(z)$ has an essential singular point at $z = z_0$, then $f(z)$ is
unbounded as $z \to z_0$, i.e., $f(z)$ has the number $A = \infty$ as a limit point as $z \to z_0$.


Proof. If the function $f(z)$ were bounded as $z \to z_0$, then its principal part $f^-(z)$
would also be bounded as $z \to z_0$, since its regular part $f^+(z)$ is obviously
bounded as $z \to z_0$. The series

$$f^-(z) = \sum_{n=-\infty}^{-1} a_n (z - z_0)^n$$

converges for all $z \ne z_0$ (see Sec. 10.45c). Hence, replacing $z - z_0$ by $1/\zeta$, we get a
series

$$g(\zeta) = \sum_{n=1}^{\infty} a_{-n} \zeta^n$$

which converges in the whole $\zeta$-plane. If $f^-(z)$ were bounded in some deleted
neighborhood $0 < |z - z_0| < \varepsilon$, then $g(\zeta)$ would be bounded for $|\zeta| > 1/\varepsilon$. Since
$g(\zeta)$ is certainly bounded for $|\zeta| \le 1/\varepsilon$, by its continuity (Theorem 5.16b), $g(\zeta)$
would be bounded on the whole $\zeta$-plane. But then $g(\zeta) \equiv$ const, by Liouville’s
theorem (Theorem 10.38) and hence $g(\zeta) \equiv 0$ since $g(0) = 0$. This implies $a_n = 0$
for all $n < 0$, contrary to hypothesis, thereby proving the theorem by contradiction. ∎


b. THEOREM. If $f(z)$ has an essential singular point at $z = z_0$, then $f(z)$ has every
number $A$ in the extended complex plane as a limit point as $z \to z_0$.


Proof. We have just seen that $A = \infty$ is a limit point of $f(z)$ as $z \to z_0$. Given any
finite complex number $A$, suppose $f(z)$ does not have $A$ as a limit point as $z \to z_0$.
Then $f(z)$ does not take the value $A$ in some neighborhood of $z_0$, and hence the
function

$$\varphi(z) = \frac{1}{f(z) - A}$$

has no singular points in this neighborhood except at the point $z_0$ itself. If $z_0$ is a
pole or essential singular point of $\varphi(z)$, then, as shown above, there exists a
sequence of points $z_n \to z_0$ such that $\varphi(z_n) \to \infty$ and hence $f(z_n) \to A$, so that $A$ is
a limit point of $f(z)$ as $z \to z_0$, contrary to hypothesis. If $z_0$ is a removable
singular point of $\varphi(z)$, then the limit

$$\lim_{z \to z_0} \varphi(z) = a$$

exists and is finite (cf. Sec. 10.46a). But there exists a sequence $z_n \to z_0$ such
that $f(z_n) \to \infty$, and hence

$$a = \lim_{z \to z_0} \varphi(z) = \lim_{n \to \infty} \frac{1}{f(z_n) - A} = 0.$$

It follows that

$$\varphi(z) = (z - z_0)^m \psi(z)$$

for some $m \ge 1$, where $\psi(z_0) \ne 0$. The function $1/\psi(z)$ is analytic at $z = z_0$ and
hence has a Taylor series expansion

$$\frac{1}{\psi(z)} = b_0 + b_1(z - z_0) + \cdots$$

at $z = z_0$, where $b_0 = 1/\psi(z_0) \ne 0$. But then

$$f(z) = A + \frac{1}{\varphi(z)} = A + \frac{1}{(z - z_0)^m \psi(z)} = A + (z - z_0)^{-m}\,[b_0 + b_1(z - z_0) + \cdots],$$

so that the Laurent expansion of $f(z)$ has only a finite number of terms in its
principal part, contrary to the assumption that $z_0$ is an essential singular point of
$f(z)$. Thus, in any event, the assumption that $f(z)$ fails to have any given finite
complex number $A$ as a limit point as $z \to z_0$ leads to a contradiction. ∎


c. If $f(z)$ has an isolated singular point at $z = z_0$, then obviously there are just
three possibilities:

(1) $f(z)$ is bounded in every neighborhood of $z_0$;
(2) $f(z)$ approaches the limit $\infty$ as $z \to z_0$;
(3) $f(z)$ has at least two limit points as $z \to z_0$, one of which equals $\infty$.

On the other hand, the following three possibilities are also mutually exclusive and
exhaustive:

(1') $z_0$ is a removable singular point of $f(z)$;
(2') $z_0$ is a pole of $f(z)$;
(3') $z_0$ is an essential singular point of $f(z)$.

It has already been shown that 1' implies 1, 2' implies 2, and 3' implies 3. But
then the converse assertions, i.e., that 1 implies 1', 2 implies 2', and 3 implies 3',
are also valid, since 1, say, must imply one of the three possibilities 1', 2', and 3',
but certainly not 2' or 3'! It should be emphasized that these assertions are
nontrivial. For example, the assertion that 1 implies 1' means that a function
which is analytic and bounded in a neighborhood of a point $z_0$ actually
approaches a finite limit as $z \to z_0$. This can only be deduced by examining the
behavior of a function near a pole (Sec. 10.48) and near an essential singular
point (Sec. 10.49).


d. Using Sec. 10.47c, we can transcribe all these results to the case of an isolated
singular point at infinity. In particular, if $f(z)$ is analytic for all finite $z$ and if $f(z)$
has a unique limit point $A = \infty$ as $z \to \infty$, then $f(z)$ is a (nonconstant) polynomial.
Put somewhat differently, if an entire function $f(z)$ is not a polynomial,† then $f(z)$
has every number in the extended complex plane as a limit point as $z \to \infty$.


10.5. Mappings and Elementary Functions 


10.51. Conformal mapping. Consider the linear function

$$w = f(z) = a(z - z_0) + w_0, \tag{1}$$

where $z_0$, $w_0$, and $a \ne 0$ are given complex numbers. Obviously $f(z_0) = w_0$ and
$f'(z_0) = a$. Moreover $w - w_0 = a(z - z_0)$, so that

$$|w - w_0| = |a|\,|z - z_0| \tag{2}$$

and

$$\arg (w - w_0) = \arg a + \arg (z - z_0) \tag{3}$$

(to within an integral multiple of $2\pi$). According to (2) and (3), the mapping $w = f(z)$
has the following geometrical meaning: Every disk $|z - z_0| \le r$ is expanded $|a|$
times in all directions, then rotated about its center $z_0$ through the angle $\arg a$,
and finally shifted until its center lies at the point $w_0$.

Now let $w = f(z)$ be any function analytic on the disk $|z - z_0| \le r$ such that

$$f(z_0) = w_0, \quad f'(z_0) = a \ne 0.$$

Then, by the very definition of the derivative,

$$w = w_0 + a(z - z_0) + \varepsilon(z, z_0)(z - z_0), \tag{4}$$

where $\varepsilon(z, z_0) \to 0$ as $z \to z_0$. According to (4), the mapping $w = f(z)$ is described
by formula (1) to within “quantities of the second order of smallness,” and
hence, to this accuracy, the mapping is the result of $|a|$-fold expansion, rotation
through the angle $\arg a$, and shifting to a new center at the point $w_0$. In
particular, the mapping carries two curves through the point $z_0$ making the angle
$\alpha$ with each other into two curves through the point $w_0$ making the same angle $\alpha$
with each other. A mapping of this kind is said to be conformal (i.e., “angle-preserving”)
at $z = z_0$.
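The angle-preserving property can be checked numerically. The following is a minimal sketch, not part of the text's argument: the function $f(z) = z^2$, the point $z_0 = 1 + i$, and the helper names `angle_between` and `image_angle` are illustrative assumptions. It approximates the tangent directions of two image curves by short chords and compares the angle between them with the original angle.

```python
import cmath

def angle_between(p, q):
    """Signed angle (radians) from complex direction p to complex direction q."""
    return cmath.phase(q / p)

def image_angle(f, z0, d1, d2, h=1e-6):
    """Angle between the images under f of two short segments leaving z0
    in the directions d1 and d2 (chords approximate the tangents)."""
    w0 = f(z0)
    return angle_between(f(z0 + h * d1) - w0, f(z0 + h * d2) - w0)

f = lambda z: z * z                    # illustrative choice, f'(z) = 2z
d1, d2 = 1 + 0j, cmath.exp(0.7j)       # two directions 0.7 radians apart

original = angle_between(d1, d2)
mapped = image_angle(f, 1 + 1j, d1, d2)   # f'(1 + i) != 0: angle is preserved
at_zero = image_angle(f, 0j, d1, d2)      # f'(0) = 0: angle is doubled instead

print(original, mapped, at_zero)
```

At the point where the derivative vanishes the angle is multiplied rather than preserved, which is the behavior described for $w = z^n$ in Sec. 10.56.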


10.52. a. Let $w = f(z)$ be a function defined on a set $E$, and let $F$ be the set of all
“images” of the points of $E$, i.e., the set

$$F = \{w : w = f(z),\ z \in E\}.$$

Then, given any point $w \in F$, every point $z \in E$ such that $f(z) = w$ is called an
inverse image of $w$.


b. THEOREM. If the function $w = f(z)$ is analytic at $z = z_0$ and if

$$f(z_0) = w_0, \quad f'(z_0) = a \ne 0,$$

then $z_0$ has a neighborhood $U$ (in the z-plane) and $w_0$ has a neighborhood $V$ (in
the w-plane) such that every point $w \in V$ has a unique inverse image $z \in U$.


Proof. The theorem is obvious in the case of the function (1), where we can
solve the equation directly for $z$, but a proof is needed in the general case. It
follows from Theorem 10.39a that the zeros of the analytic function $f(z) - w_0$
must be isolated (why?). Hence there is a closed disk $|z - z_0| \le \varepsilon$ on which
$f(z) - w_0$ vanishes only once, namely at the point $z = z_0$. Then, by Theorem 10.44,

$$\frac{1}{2\pi i} \oint_{|z - z_0| = \varepsilon} \frac{f'(z)}{f(z) - w_0}\,dz = 1,$$

since the only zero of $f(z) - w_0$ inside or on the circle $|z - z_0| = \varepsilon$ is of order 1. The
function

$$N(w) = \frac{1}{2\pi i} \oint_{|z - z_0| = \varepsilon} \frac{f'(z)}{f(z) - w}\,dz$$

is defined for $w = w_0$ (where it equals 1) and for neighboring values of $w$, say for
$|w - w_0| < \delta$, since the denominator is nonvanishing on the circle $|z - z_0| = \varepsilon$ not only
for $w = w_0$ but also for all values of $w$ sufficiently close to $w_0$. Moreover, $N(w)$ is
analytic and hence certainly continuous for $|w - w_0| < \delta$, by Theorem 10.27. But
$N(w)$ takes only integral values (Theorem 10.44) and hence $N(w)$ must equal 1
for $|w - w_0| < \delta$. By Theorem 10.44 again, this implies that the function $f(z) - w$
vanishes only once inside the circle $|z - z_0| = \varepsilon$, i.e., for any $w$ in the neighborhood
$V = \{w : |w - w_0| < \delta\}$ there is one and only one point $z$ in the neighborhood
$U = \{z : |z - z_0| < \varepsilon\}$ such that $f(z) = w$. ∎
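The counting integral $N(w)$ used in this proof can be approximated numerically. The sketch below is only an illustration under stated assumptions: the choice $f(z) = e^z$, $z_0 = 0$, $\varepsilon = 0.5$, the sample values of $w$, and the helper name `count_preimages` are not from the text. It discretizes the circle $|z - z_0| = \varepsilon$ and sums the integrand.

```python
import cmath
import math

def count_preimages(f, df, w, z0, eps, steps=4000):
    """Approximate N(w) = (1/(2*pi*i)) * integral of f'(z)/(f(z) - w) dz
    over the circle |z - z0| = eps, using equally spaced sample points."""
    total = 0j
    for k in range(steps):
        t = 2 * math.pi * k / steps
        z = z0 + eps * cmath.exp(1j * t)
        dz = 1j * eps * cmath.exp(1j * t) * (2 * math.pi / steps)
        total += df(z) / (f(z) - w) * dz
    return total / (2j * math.pi)

f = df = cmath.exp          # illustrative choice: f(z) = e^z, so f'(z) = e^z
z0 = 0j
w0 = f(z0)                  # w0 = 1, and f'(z0) = 1 != 0

n_center = count_preimages(f, df, w0, z0, 0.5)
n_near = count_preimages(f, df, w0 + 0.1 + 0.05j, z0, 0.5)
n_far = count_preimages(f, df, 5 + 0j, z0, 0.5)   # no solution of e^z = 5 with |z| < 0.5

print(n_center, n_near, n_far)
```

For values of $w$ close to $w_0$ the computed value stays near 1, while for a value of $w$ with no inverse image inside the circle it drops to 0, in line with the fact that $N(w)$ takes only integral values.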


c. THEOREM. Let $f(z)$, $U$, and $V$ be the same as in the preceding theorem. Then the
function $f(z)$ is one-to-one on some neighborhood of $z_0$.

Proof. Some points of the set $f(U) = \{w : w = f(z),\ z \in U\}$ may fail to lie in $V$.
However, $f(z)$ is continuous at $z = z_0$, and hence there is another neighborhood
$U^* \subset U$ of the point $z_0$ such that the set $f(U^*) = \{w : w = f(z),\ z \in U^*\}$ is contained
in $V$. But, by the preceding theorem, every $w \in f(U^*)$ has a unique inverse image
in $U$, and this inverse image can only be the point $z \in U^*$ such that $f(z) = w$. ∎


10.53. THEOREM. Let $f(z)$, $U^*$, and $V$ be the same as in the preceding theorem, and
let $z = \varphi(w)$ be the inverse of the function $w = f(z)$ on $U^*$, i.e., the function
assigning to each point $w \in f(U^*)$ the unique point $z \in U^*$ such that $f(z) = w$.
Then $\varphi(w)$ is differentiable at $w = w_0 = f(z_0)$, with derivative

$$\varphi'(w_0) = \frac{1}{f'(z_0)}.$$

Proof. The function $\varphi(w)$ is continuous at $w = w_0$. In fact, given any sufficiently
small neighborhood $U = \{z : |z - z_0| < \varepsilon\}$ of the point $z_0$, the same argument as in
the proof of Theorem 10.52b shows that there is a corresponding neighborhood
$V = \{w : |w - w_0| < \delta\}$ of the point $w_0$ such that the inverse image of every point
$w \in V$ belongs to $U$. But this is precisely what is meant by saying that $\varphi(w)$ is
continuous at $w = w_0$.

Now consider the “difference quotient”

$$\frac{\varphi(w) - \varphi(w_0)}{w - w_0}. \tag{5}$$

As just shown, $z = \varphi(w) \to z_0 = \varphi(w_0)$ as $w \to w_0$, and hence (cf. Theorem 7.16) the limit of
(5) as $w \to w_0$ exists and equals

$$\lim_{w \to w_0} \frac{\varphi(w) - \varphi(w_0)}{w - w_0} = \lim_{z \to z_0} \frac{z - z_0}{f(z) - f(z_0)} = \frac{1}{\lim\limits_{z \to z_0} \dfrac{f(z) - f(z_0)}{z - z_0}} = \frac{1}{f'(z_0)}.$$
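As a quick numerical illustration of this formula (a sketch under assumptions, not part of the proof), take $f(z) = z^2$ near $z_0 = 1 + i$, whose local inverse is the principal square root. The function, the point, and the step `h` are illustrative choices.

```python
import cmath

f = lambda z: z * z         # illustrative f, with f'(z) = 2z
df = lambda z: 2 * z
phi = cmath.sqrt            # principal square root: the inverse of f near z0

z0 = 1 + 1j
w0 = f(z0)                  # w0 = 2i, and phi(w0) returns z0

h = 1e-6 * cmath.exp(0.3j)  # small complex increment in an arbitrary direction
quotient = (phi(w0 + h) - phi(w0)) / h    # difference quotient (5)
predicted = 1 / df(z0)                    # 1 / f'(z0)

print(quotient, predicted)
```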


10.54. a. A function $w = f(z)$ is said to be univalent on a domain $G$ if it is one-to-one
and analytic on $G$. According to Theorem 10.52c, if $f(z)$ is analytic at $z = z_0$
and $f'(z_0) \ne 0$, then $f(z)$ is univalent on some neighborhood of $z_0$. Moreover, as
we will see later (Sec. 10.59b), if $f(z)$ is univalent on a domain $G$, then $f'(z) \ne 0$
for all $z \in G$.


b. The linear function (1) is obviously univalent on the whole z-plane, while the
function

$$w = \frac{1}{z} \tag{6}$$

is univalent on the whole z-plane minus the point $z = 0$.†


10.55. By a fractional linear function we mean any mapping of the form

$$w = \frac{az + b}{cz + d} \quad (bc \ne ad,\ c \ne 0). \tag{7}$$

Writing (7) as

$$w = \frac{a}{c} + \frac{bc - ad}{c(cz + d)}, \tag{7'}$$

we see that (7) is the result of applying first a linear function of the form (1),
then the function (6), and finally another linear function. But (1) is a one-to-one
mapping of the extended z-plane onto the extended w-plane, carrying the point
$z = \infty$ into the point $w = \infty$, while (6) is a one-to-one mapping of the extended
z-plane onto the extended w-plane, carrying the point $z = 0$ into the point $w = \infty$
and the point $z = \infty$ into the point $w = 0$.† Hence the fractional linear function (7)
is itself a one-to-one mapping of the extended z-plane onto the extended w-plane,
carrying the point $z = -d/c$ into the point $w = \infty$ and the point $z = \infty$ into the
point $w = a/c$.
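The decomposition of a fractional linear function into a linear map, the inversion $w = 1/z$, and a second linear map can be spelled out step by step. In this sketch the coefficient values and the function names `mobius` and `mobius_by_steps` are illustrative assumptions.

```python
def mobius(a, b, c, d, z):
    """Direct evaluation of w = (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)

def mobius_by_steps(a, b, c, d, z):
    """The same map as a composition: linear, then 1/z, then linear."""
    z1 = c * z + d                             # a linear function of the form (1)
    z2 = 1 / z1                                # the function (6)
    return a / c + (b * c - a * d) / c * z2    # another linear function

a, b, c, d = 2 + 1j, 1 - 3j, 1j, 4 + 0j        # chosen so that bc != ad and c != 0
samples = [0j, 1 + 1j, -2.5 + 0.5j, 3j]        # none equals the pole z = -d/c = 4i
max_gap = max(abs(mobius(a, b, c, d, z) - mobius_by_steps(a, b, c, d, z))
              for z in samples)

print(max_gap)
```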


10.56. Let

$$w = z^n, \tag{8}$$

where $n$ is an integer greater than 1. Then

$$|w| = |z|^n, \quad \arg w = n \arg z,$$

so that the function (8) is univalent on the “wedge” or “angular domain”

$$-\alpha < \arg z < \alpha \quad (\alpha \le \pi/n), \tag{9}$$

which it maps in a one-to-one fashion onto the wedge

$$-n\alpha < \arg w < n\alpha.$$

Clearly the largest wedge of the form (9) on which the function (8) remains
univalent is the wedge

$$-\pi/n < \arg z < \pi/n, \tag{9'}$$

which is mapped by (8) onto the domain $W$, consisting of all points of the
w-plane except those on the negative real axis.

The inverse of the function (8) is denoted by

$$z = \sqrt[n]{w} \tag{10}$$

and called the nth root of $w$, as in the case of positive real $w$, for which it reduces
to the familiar function of elementary analysis (Sec. 1.63). The function (10) is
univalent on the domain $W$, which it maps onto the wedge (9').

According to Sec. 10.51, the mapping (8) is conformal at every point $z \ne 0$,
since the derivative $w' = n z^{n-1}$ is nonzero if $z \ne 0$. However, (8) fails to be
conformal at the point $z = 0$, where $w'$ vanishes. In fact, the mapping obviously
produces an $n$-fold enlargement of angles between rays intersecting at $z = 0$.


10.57. Next consider the exponential

$$w = e^z. \tag{11}$$

Writing $w = u + iv$, $z = x + iy$, we have

$$u + iv = e^{x + iy} = e^x (\cos y + i \sin y)$$

(Sec. 8.63), so that in particular

$$|e^z| = e^x, \quad \arg e^z = y + 2k\pi \quad (k = 0, \pm 1, \pm 2, \ldots). \tag{12}$$

The function (11) maps the horizontal line $y = y_0$ in the z-plane into the curve

$$u = e^x \cos y_0, \quad v = e^x \sin y_0,$$

i.e., the ray drawn from the origin of the w-plane making the angle $y_0$ with the
positive u-axis. By the same token, the strip

$$-\alpha < y < \alpha \quad (\alpha \le \pi) \tag{13}$$

in the z-plane is mapped by (11) in a one-to-one fashion onto the wedge

$$-\alpha < \arg w < \alpha$$

in the w-plane. The largest strip of the form (13) on which the function (11)
remains univalent is the strip

$$-\pi < y < \pi, \tag{13'}$$

which is mapped by (11) onto the same domain $W$ as in Sec. 10.56.

The inverse of the function (11) is denoted by

$$z = \ln w \tag{14}$$

and called the logarithm, as in the case of positive real $w$, for which it reduces to
the familiar function of elementary analysis (Sec. 5.4). The function (14) is
univalent on the domain $W$, which it maps onto the strip (13'). Solving (12) for $x$
and $y$, we get

$$x = \ln |e^z|, \quad y = \arg e^z,$$

and hence

$$\ln w = z = x + iy = \ln |e^z| + i \arg e^z = \ln |w| + i \arg w \quad (-\pi < \arg w < \pi). \tag{15}$$


Formula (15) allows us to take real and imaginary parts of $\ln w$. Moreover,
“inverting” the formula

$$e^{z_1 + z_2} = e^{z_1} e^{z_2}$$

(Theorem 8.64), we get the corresponding property of complex logarithms:

$$\ln w_1 w_2 = \ln w_1 + \ln w_2.$$

To differentiate the logarithm, we use Theorem 10.53 and the formula

$$(e^z)' = e^z$$

proved in Sec. 10.37b, obtaining

$$(\ln w)' = \frac{1}{(e^z)'} = \frac{1}{e^z} = \frac{1}{w}.$$

It follows from Theorem 10.25b that

$$\ln w = \ln w_0 + \int_{w_0}^{w} \frac{d\omega}{\omega}$$

for every $w \in W$, where the path of integration is any piecewise smooth curve
$L \subset W$ joining $w_0$ to $w$. The simplest choice is $w_0 = 1$. Then $\ln w_0 = 0$ and we have

$$\ln w = \int_1^w \frac{d\omega}{\omega}.$$

Thus the logarithm has now been defined for all complex $w$ with the exception
of negative real $w$. Later we will enlarge the domain of definition of $\ln w$ to
include negative values of $w$ as well (see Example 10.59d).
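The integral representation of the logarithm can be tested numerically. The following sketch assumes a straight-line path from 1 to $w$ (which stays in $W$ whenever $w$ is not a negative real number) and a simple midpoint rule; the helper name `log_by_integration` and the sample point are illustrative.

```python
import cmath

def log_by_integration(w, steps=20000):
    """Midpoint-rule approximation of the integral of d(omega)/omega
    along the straight segment from 1 to w."""
    step = (w - 1) / steps
    total = 0j
    for k in range(steps):
        omega = 1 + (k + 0.5) * step    # midpoint of the k-th subsegment
        total += step / omega
    return total

w = -1 + 0.5j
approx = log_by_integration(w)
exact = cmath.log(w)        # principal value ln|w| + i arg w, as in (15)

print(approx, exact)
```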


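10.58. We now use the formula

$$z^\lambda = e^{\lambda \ln z}$$

to define an arbitrary positive real power of $z$, assuming that $z$ is not a negative
real number. Consider the wedge

$$-\alpha < \arg z < \alpha \quad (\alpha \le \min \{\pi, \pi/\lambda\}) \tag{16}$$

or equivalently

$$-\alpha < \operatorname{Im} \ln z < \alpha. \tag{17}$$

Multiplication of (17) by $\lambda$ gives

$$-\lambda\alpha < \operatorname{Im} (\lambda \ln z) < \lambda\alpha, \tag{18}$$

which corresponds to

$$-\lambda\alpha < \arg w < \lambda\alpha,$$

where

$$w = z^\lambda. \tag{19}$$

The largest wedge of the form (16) on which the function (19) remains univalent
is the wedge

$$-\pi/\lambda < \arg z < \pi/\lambda, \tag{16'}$$

which is mapped by (19) onto the same domain $W$ as in Secs. 10.56 and 10.57.
Using the usual rules, we find that the derivative of (19) is just

$$(z^\lambda)' = (e^{\lambda \ln z})' = e^{\lambda \ln z} (\lambda \ln z)' = \lambda \frac{e^{\lambda \ln z}}{z} = \lambda z^{\lambda - 1}.$$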

10.59. Riemann surfaces 


a. The function

$$w = (z - z_0)^n \quad (n > 1) \tag{20}$$

is univalent on the wedge

$$-\pi/n < \arg (z - z_0) < \pi/n$$

(cf. Sec. 10.56) and analytic on the disk $K = \{z : |z - z_0| < r\}$, but not univalent on
$K$. In fact, (20) maps the circle $L = \{z : |z - z_0| = \rho < r\}$ onto the circle
$\Lambda = \{w : |w| = \rho^n\}$ in the w-plane, but as the point $z$ makes one circuit around $L$, the
corresponding point $w$ makes $n$ circuits around $\Lambda$.


b. The situation is similar near a zero $z_0$ of order $n$ of any analytic function

$$w = f(z) = a_n (z - z_0)^n + a_{n+1} (z - z_0)^{n+1} + \cdots \quad (a_n \ne 0,\ n > 1).$$

In fact, let $U$ be a neighborhood of $z_0$ containing no other zeros of $f(z)$ and $f'(z)$,
and let $L \subset U$ be a piecewise smooth closed path surrounding $z_0$. By Theorem
10.44, the logarithmic residue of $f(z)$ with respect to $L$ is just

$$\frac{1}{2\pi i} \oint_L \frac{f'(z)}{f(z)}\,dz = n.$$

But then

$$\frac{1}{2\pi i} \oint_L \frac{f'(z)}{f(z) - w}\,dz = n$$

for all $w \ne 0$ with sufficiently small absolute value, by the same argument as in
the proof of Theorem 10.52b. It follows, again by Theorem 10.44, that $f(z)$ takes
any such value $w$ precisely $n$ times in $U$, with due regard for order. But $f'(z)$ does
not vanish in $U$ except at the point $z = z_0$ itself, and hence $f(z)$ takes the value $w$
at $n$ distinct points of $U$.

In cases like that just considered, the mapping can be made one-to-one by
introducing the notion of a Riemann surface. We will not give a general
definition of a Riemann surface here, confining ourselves instead to a discussion
of two typical examples and some subsequent remarks of a qualitative nature.†


c. Example. To construct the Riemann surface for the function $w = z^n$ ($n > 1$) we
start with $n$ “samples” $D_0, D_1, \ldots, D_{n-1}$ of the ordinary w-plane, all “cut along
the negative real axis” (i.e., with the points of the negative real axis deleted),
regarding the point $w = 0$ as the same for all “samples” or “sheets.” The sheets
are then “pasted together” (i.e., the corresponding points of the sheets are then
identified) in the following fashion: The lower edge of the cut on the first sheet
$D_0$ is pasted to the upper edge of the cut on the second sheet $D_1$, the lower edge
of the cut on the second sheet $D_1$ is pasted to the upper edge of the cut on the
third sheet $D_2$, and so on, until finally the lower edge of the cut on the nth sheet
$D_{n-1}$ is pasted to the upper edge of the cut on the first sheet $D_0$, as shown in
Figure 72 (for the case $n = 4$).‡ At the same time, each sheet is regarded as
equipped with its original real and imaginary axes.

The function $w = z^n$ can now be regarded as a one-to-one mapping of the
whole z-plane onto the Riemann surface just constructed. In fact, suppose the
point $z$ traverses the circle $|z| = \rho$ in the counterclockwise direction, starting from
the initial position $z = \rho > 0$. Then the corresponding point $w$ traverses the circle
$|w| = \rho^n$ on the first sheet of the Riemann surface until the point $z$ reaches the
boundary of the “domain of univalence” $-\pi/n < \arg z < \pi/n$ (cf. Sec. 10.56),
corresponding to the value $\arg z = \pi/n$. Then $\arg w = \pi$, so that $w$ lies on the cut
along which the first sheet is pasted to the second sheet. As $z$ traverses the circle
$|z| = \rho$ further in the same direction, the point $w$ moves onto the second sheet,
then onto the third sheet, and so on, finally returning to its original position on
the first sheet after $z$ has made one complete circuit around the circle $|z| = \rho$.
These considerations show that the function $w = z^n$ is in fact a one-to-one
mapping of the z-plane onto the Riemann surface shown in Figure 72. The
inverse function, which we continue to denote by $z = \sqrt[n]{w}$, is defined on the same
Riemann surface, which it maps back onto the whole z-plane in a one-to-one
fashion.

Figure 72

Any $n$ distinct points of the Riemann surface with the same real and imaginary
parts but on different sheets of the surface correspond to $n$ distinct points
$z_0, z_1, \ldots, z_{n-1}$ of the z-plane, namely the points with absolute value $\sqrt[n]{|w|}$ and
arguments†

$$\arg z_k = \frac{1}{n} (\arg w + 2k\pi) \quad (k = 0, 1, \ldots, n - 1).$$
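The formula for the $n$ inverse images of a point $w$ translates directly into a short computation. This is a minimal sketch; the function name `nth_roots` and the sample value $w = -8$, $n = 3$ are illustrative choices, and `cmath.phase` supplies the principal value of $\arg w$.

```python
import cmath
import math

def nth_roots(w, n):
    """Return the n points z_k with |z_k| = |w|**(1/n) and
    arg z_k = (arg w + 2*k*pi) / n for k = 0, 1, ..., n - 1."""
    r = abs(w) ** (1.0 / n)
    theta = cmath.phase(w)              # principal value of arg w
    return [cmath.rect(r, (theta + 2 * math.pi * k) / n) for k in range(n)]

w = -8 + 0j
roots = nth_roots(w, 3)
for z in roots:
    print(z, z ** 3)                    # each cube reproduces w up to rounding
```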


d. Example. Next we construct the Riemann surface for the function $w = e^z$,
starting from a (countably) infinite number of samples $D_k$ ($k = 0, \pm 1, \pm 2, \ldots$) of
the w-plane, all cut along the negative real axis. The sheets are pasted together in
the same way as in the preceding example, i.e., they are all joined together at the
point $w = 0$ and for each $k$ the lower edge of the cut on the sheet $D_k$ is pasted to
the upper edge of the cut on the sheet $D_{k+1}$ (see Figure 73).

The function $w = e^z$ can now be regarded as a one-to-one mapping of the
whole z-plane onto this Riemann surface. In fact, suppose the point $z$ moves
upward along the vertical line $x = x_0$, starting from the initial position $z = x_0 > 0$.
Then the corresponding point $w$ traces out the circle $|w| = e^{x_0}$ on the first sheet of
the Riemann surface until $z$ reaches the boundary of the “domain of univalence”
$-\pi < y < \pi$ (cf. Sec. 10.57), corresponding to the value $y = \pi$. Then $\arg w = \pi$, so
that $w$ lies on the cut along which the first sheet ($D_0$ in the figure) is pasted to the
second sheet. As $z$ moves further up the line $x = x_0$, the point $w$ moves onto the
second sheet ($D_1$ in the figure), then onto the third sheet ($D_2$), and so on indefinitely.
(As an exercise, describe what happens as the point $z$ moves downward along
the line $x = x_0$.)


The inverse function, which we continue to denote by $z = \ln w$, is defined on
the same Riemann surface, which it maps onto the whole z-plane in a one-to-one
fashion, with the value of $w$ on the nth sheet corresponding to the value of $z$ in
the strip

$$(2n - 1)\pi \le y \le (2n + 1)\pi.$$

Figure 73

On the other hand, the inverse of the function $w = e^z$, regarded as a function
defined on the whole w-plane (rather than on the Riemann surface), is a
“multiple-valued” function (Sec. 2.84), in fact an “infinite-valued” function,
which we denote by $\operatorname{Ln} w$. The values of $\operatorname{Ln} w$ at the point $w$ are given by the
formula

$$\operatorname{Ln} w = \ln |w| + i \arg w,$$

where $\arg w$ varies over all its possible values (differing by integral multiples of
$2\pi$). In particular, for a negative real number $w = -\rho$ ($\rho > 0$), we have

$$\operatorname{Ln} (-\rho) = \ln \rho + (2k + 1)\pi i \quad (k = 0, \pm 1, \pm 2, \ldots),$$

where there is no natural reason to prefer one of these infinitely many values of
the logarithm to any other.
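A few of the infinitely many values of $\operatorname{Ln} w$ can be listed explicitly. In the sketch below the helper name `ln_values` and the sample point $w = -2$ are illustrative assumptions; each value is checked by exponentiating it.

```python
import cmath
import math

def ln_values(w, ks):
    """Values ln|w| + i*(arg w + 2*k*pi) of the multiple-valued logarithm,
    one for each integer k in ks (arg w is taken as the principal value)."""
    return [complex(math.log(abs(w)), cmath.phase(w) + 2 * math.pi * k)
            for k in ks]

w = -2.0 + 0j
values = ln_values(w, range(-2, 3))
for v in values:
    print(v, cmath.exp(v))              # every value exponentiates back to w
```

For this negative real $w$ the imaginary parts are odd multiples of $\pi$, as in the formula for $\operatorname{Ln}(-\rho)$.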


e. Riemann surfaces can be constructed for other analytic functions in much the 
same way. To construct a Riemann surface for a given function w = f(z), we first 
prepare a suitable number of identical sheets, equipped with cuts making them 
into “images” of domains of univalence of f(z). We then paste the edges of the 
cuts on the different sheets together (just as was done in the above examples), 
thereby guaranteeing the one-to-one character of the resulting mapping of the 
whole z-plane onto the whole Riemann surface for w. 

Suppose $z_0$ is a point of the z-plane with the property that every circuit around
a circle of arbitrarily small radius centered at $z_0$ causes the function $w = f(z)$ to
take a new value. Then $z_0$ is called a branch point of $f(z)$.† Branch points play an
important role in determining the structure of a Riemann surface. Given any
point $z_0$ which is not a branch point of $f(z)$, there is a neighborhood of $z_0$ in
which $f(z)$ reduces to a family of single-valued analytic functions. No such
neighborhood exists for a branch point.


Problems 


1. Given a series

$$f(z) = \sum_{n=0}^{\infty} a_n z^n \tag{1}$$

with radius of convergence 1, suppose the coefficients $a_n$ are all positive. Prove
that $z = 1$ is a singular point of $f(z)$.

2. Given a series (1) with radius of convergence 1, suppose the only singular
point of $f(z)$ on the circle $|z| = 1$ is a pole $z_0$ of order $m$. Prove that $|a_n| < A n^{m-1}$,
where $A$ is a constant.
3. (Maximum modulus principle). Let $f(z)$ be analytic on a domain $G$ bounded by
a contour $L$, and let

$$M = \max_{z \in L} |f(z)|.$$

Prove that $|f(z)| \le M$ for all $z \in G$. Prove that if $|f(z_0)| = M$ at some (interior) point
$z_0 \in G$, then $f(z)$ is constant on $G$.

4. (Schwarz's lemma). Let $f(z)$ be analytic on the disk $K = \{z : |z| < R\}$, where $f(0) = 0$
and $|f(z)| \le M$ for all $z \in K$. Prove that

$$|f(z)| \le \frac{M}{R} |z|$$

for all $z \in K$.


5. Let $f(z)$ be analytic on the disk $K = \{z : |z| < R\}$, and let

$$M(r) = \max_{|z| = r} |f(z)| \quad (r < R).$$

Prove that $\ln M(r)$ is a convex function of $\ln r$. (Hadamard)

6. A function $u(x, y)$ of two real variables $x$ and $y$ is said to be differentiable at
the point $(x, y) = (x_0, y_0)$ if there exist numbers $A_u$ and $B_u$ such that

$$u(x_0 + h, y_0 + k) = u(x_0, y_0) + A_u h + B_u k + \varepsilon(h, k)$$

in a neighborhood of $(x_0, y_0)$, where

$$\frac{\varepsilon(h, k)}{\sqrt{h^2 + k^2}} \to 0$$

as $h \to 0$, $k \to 0$. Prove that the complex function $w = u(x, y) + iv(x, y)$ is
differentiable at $z = z_0$ if and only if $u(x, y)$ and $v(x, y)$ are differentiable at the
point $(x, y) = (x_0, y_0)$ and $A_u = B_v$, $A_v = -B_u$.
7. Prove that the fractional linear function

$$w = \frac{az + b}{cz + d}$$

carries the family of all circles (including straight lines, regarded as circles of
infinite radius) into itself.

8. The “growth” of the function

$$f(z) = \sum_{n=0}^{\infty} a_n z^n$$

can be estimated by using the functions

$$M(r) = \max_{|z| = r} |f(z)|$$

and

$$M_1(r) = \sum_{n=0}^{\infty} |a_n| r^n,$$

where obviously $M(r) \le M_1(r)$. Prove that


$$M_1(r) \le \frac{r + \delta}{\delta} M(r + \delta)$$


for every $\delta > 0$, so that “$M_1(r)$ does not grow too rapidly compared to $M(r)$.”


9. (Argument principle). Let $f(z)$ be analytic on a piecewise smooth simple
closed contour L and its interior G, except possibly for poles in G, and let f(z) be 
nonzero on L. Prove that the number of circuits around the point w = 0 made by 
the point w = f(z) as z traverses L once in the counterclockwise direction equals 
the number of zeros of f(z) in G minus the number of poles. 

10. Let $f_n(z)$ be a sequence of analytic functions which converges uniformly
inside a domain $G$ (see Sec. 10.36a) to a function $f(z) \not\equiv 0$, and let $Z$ be the set of
all zeros of all the functions $f_n(z)$ in $G$. Prove that the set of all zeros of the
function $f(z)$ in $G$ (with a zero of order $m$ counted as $m$ zeros) coincides with the
set of all limit points of $Z$ in $G$.

11. Prove that if the series

$$\sum_{n=0}^{\infty} g^{(n)}(z)$$

converges at even one point of analyticity of the function $g(z)$, then $g(z)$ is entire
and the series converges uniformly on every disk. (Pólya)


12. Construct the Riemann surface, find the branch points, and describe the
domains of univalence of the function

$$w = \frac{1}{2} \left( z + \frac{1}{z} \right) \tag{2}$$

and its inverse

$$z = w + \sqrt{w^2 - 1}. \tag{2'}$$

Do the same for the function

$$w = \cos z \tag{3}$$

and its inverse

$$z = \operatorname{Arc\,cos} w. \tag{3'}$$

13. Prove that $i \operatorname{Arc\,cos} w = \operatorname{Ln} (w + \sqrt{w^2 - 1})$.

14. Prove that if the functions $f_1(z), \ldots, f_n(z)$ are all analytic on a domain $G$, then
$|f_1(z)| + \cdots + |f_n(z)|$ cannot have a maximum in $G$.

15. Suppose f(z) is analytic on the whole z-plane which it maps into a subset of 
the upper half-plane. Prove that f(z) is a constant. 

16. Derive the fundamental theorem of algebra (Theorem 5.85c) from 
Liouville’s theorem (Sec. 10.38). 

17. Derive the fundamental theorem of algebra from Theorem 10.44. 

18. Derive the fundamental theorem of algebra from the maximum modulus 
principle (Problem 3). 

19. (Partial fraction expansions). Let $f(z)$ be analytic on the whole z-plane
except at the points $z_1, \ldots, z_n, \ldots$, where $f(z)$ has simple poles with residues
$b_1, \ldots, b_n, \ldots$ Suppose that for some $C$ the set $\{z : |f(z)| \le C\}$ contains a family of circles
$\Gamma_n$ centered at $z = 0$ with radii $R_n \to \infty$. Prove that

$$f(z) = f(0) + \sum_{n=1}^{\infty} b_n \left( \frac{1}{z - z_n} + \frac{1}{z_n} \right). \tag{4}$$


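20. (Infinite product expansions). Let $g(z)$ be analytic on the whole z-plane, with
simple zeros at the points $z_1, \ldots, z_n, \ldots$, and suppose the function

$$f(z) = \frac{g'(z)}{g(z)}$$

satisfies the conditions of the preceding problem. Prove that

$$g(z) = g(0)\, e^{\frac{g'(0)}{g(0)} z} \prod_{n=1}^{\infty} \left( 1 - \frac{z}{z_n} \right) e^{z/z_n}. \tag{5}$$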
21. Verify the expansions 
P = z* 
sin z=2]]{(1 = 7) 
e 22 


22. (Phragmén-Lindelöf theorem). Suppose $f(z)$ is a function analytic on a
domain

$$G = \{z : -\alpha \le \arg z \le \alpha < \pi/2\}$$

such that

$$|f(z)| < A e^{|z|}$$

for all $z \in G$, while $|f(z)| \le 1$ on the rays $\arg z = \pm\alpha$. Prove that the inequality
$|f(z)| \le 1$ also holds for all $z \in G$.

23. Suppose $f(z)$ is a function analytic on the upper half-plane $G$ such that

$$|f(x)| \le C \quad (-\infty < x < \infty),$$

while, given any $\varepsilon > 0$,

$$|f(z)| \le A_\varepsilon e^{\varepsilon |z|}$$

for all $z \in G$ and some $A_\varepsilon$. Prove that $|f(z)| \le C$ for all $z \in G$.

24. Suppose $f(z)$ is an entire function such that

$$|f(x)| \le C, \quad \inf |f(x)| = 0 \quad (-\infty < x < \infty),$$

while, given any $\varepsilon > 0$,

$$|f(z)| \le A_\varepsilon e^{\varepsilon |z|}$$

for all $z$ and some $A_\varepsilon$. Prove that $f(z) \equiv 0$.


+ More concisely, Cz can be called the “z-plane” and C\, the “w-plane.” 

+ Here we use the term “domain” in a precise technical sense, as opposed to its loose meaning in the 
phrases “in the real domain” and “in the complex domain.” 

t As in the real case, a function f(z) is said to be differentiable on a set E if it is differentiable at every point 
of E. 

+ To be perfectly explicit, u(x, y) denotes the real function of two real variables x and y such that u(x, y) = 
u(x + iy) for all x + iy © E, and similarly for v(x, y). 

+ See, e.g., R. A. Silverman, /ntroductory Complex Analysis, Dover Publications, Inc., N.Y. (1972), 
Theorem 13.2, p. 275. 

† Let $P = P(t)$, $a \le t \le b$ be a variable point of $L$. Then, as on p. 363, (1) reduces to the ordinary integral

$$\int_a^b f(t) g'(t)\,dt$$

if $f(t) = f(P(t))$ is continuous and $g(t) = g(P(t))$ has a continuous derivative.

+ In other words, suppose f(z, ¢) is continuous on L x A (see Sec. 5.18). It then follows from Theorem 5.17b 
that f(z, ¢) is uniformly continuous on L x A. 

+ The fact that zg can be joined to z by a path L entirely contained in G follows from the connectedness of G 
(see Sec. 10.1 1e). 

+ Note that this is consistent with the definition of a simple closed polygonal line, given in the footnote on 
p. 287. 

t A plane set £ is said to be convex if whenever £ contains two points zj and z9, E also contains the line 
segment joining and z, and z9. 

+ There is a missing detail here, which the reader should prove as an exercise, namely that Lyy (like L itself) 
is contained in G if the segments of Lyq are sufficiently small. 

+ A closed path L is said to “surround” a set E (or a point zg) if L has no multiple points and if E (or zo) 
belongs to the interior of L. 


+ For example, it follows from Theorem 10.22b or from a familiar property of power series (Theorem 
6.65a) that the series (8) converges uniformly inside the disk K. 


+ Recall the definition of a compact set (Sec. 3.91a). 

+ In particular, zg is a singular point of f(z) if f(z) fails to be defined at z = zg or fails to be differentiable at z 
= Z(Q)- 

+ The argument leading to (4) is the natural generalization of that given in Sec. 10.33b in connection with 
Figure 68. 

+ As shown in Sec. 10.37, the integrand of (5) is just the derivative of the composite function In f(z). 

† It will be recalled that (6) is just the limit of the sum

$$\sum_{n=-p}^{q} a_n (z - z_0)^n$$

as $p$ and $q$ approach infinity independently (see Sec. 6.48a). We often call (6) the Laurent (series) expansion
of $f(z)$ at $z = z_0$.

+ There is a more exact theorem due to Picard, which asserts that f(z) actually takes every finite value (with 
one possible exception) infinitely many times in every neighborhood of an essential singular point. See e.g., 
A. I. Markushevich, Theory of Functions of a Complex Variable, Vol. II (translated by R. A. Silverman), 
Chelsea Publishing Co., N. Y. (1977), Theorem 9.16, p. 343. 


+ By Liouville’s theorem (Theorem 10.38), the polynomial reduces to a constant if f(z) is bounded. 

+ The angle between two intersecting curves (with tangents) is defined as the angle between their tangents 
at the point of intersection. 

+ Note that the mapping (6) is conformal at every finite point z # 0 (see Sec. 10.51). 

† More exactly, if $w$ is the linear function (1), then $w \to \infty$ as $z \to \infty$, while if $w$ is the function (6), then
$w \to \infty$ as $z \to 0$ and $w \to 0$ as $z \to \infty$ (see Sec. 10.47a).


+ Readers interested in pursuing the study of Riemann surfaces are referred to the abundant literature on the 
subject. See e.g., Markushevich Theory of Functions of a Complex Variable, Vol. U1, Chapter 7 and G. 
Springer, Introduction to Riemann Surfaces, Addison-Wesley Publishing Co., Inc., Reading, Mass. (1957). 


{ The fact that this “pasting together” cannot actually be accomplished in three-dimensional space without 
further “‘self-intersections” does not matter. 


† Here we restrict $\arg z_k$ and $\arg w$ to their principal values, i.e., to their unique values in the half-open
interval $(-\pi, \pi]$.

† As an exercise, show that the points $z = 0$ and $z = \infty$ are branch points (and the only ones) of the functions
$w = \sqrt[n]{z}$ and $w = \operatorname{Ln} z$.


11 Improper Integrals


11.1. Improper Integrals of the First Kind 


11.11. Let $y = f(x)$ be a complex-valued function of a real variable $x$, defined on
the interval $a \le x \le b < \infty$. Then, in keeping with Sec. 10.21, the integral of $f(x)$
over the interval $[a, b]$ is defined as

$$\int_a^b f(x)\,dx = \int_a^b u(x)\,dx + i \int_a^b v(x)\,dx,$$

where $u(x)$ is the real part and $v(x)$ the imaginary part of $f(x)$. Thus $f(x)$ is
integrable on $[a, b]$ if and only if $u(x)$ and $v(x)$ are both integrable on $[a, b]$.

Suppose $f(x)$ is integrable (being piecewise continuous, say) on every finite
interval $a \le x \le X < \infty$, where $a$ is fixed and $X$ variable, and consider the
(complex) function

$$I(X) = \int_a^X f(x)\,dx \tag{1}$$

defined for all $X \ge a$. Suppose $I(X)$ approaches a finite (complex) limit $I$ as
$X \to \infty$. Then the expression

$$\int_a^\infty f(x)\,dx, \tag{2}$$

called an improper integral of the first kind, is said to be convergent, with value
$I$. In other words, we set

$$\int_a^\infty f(x)\,dx = I = \lim_{X \to \infty} I(X) = \lim_{X \to \infty} \int_a^X f(x)\,dx,$$

by definition. On the other hand, if $I(X)$ does not approach a finite limit as
$X \to \infty$, we call the integral (2) divergent and assign it no value at all.


Examples 


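a. The improper integral

$$\int_1^\infty x^\alpha\,dx$$

is convergent for $\alpha < -1$, with value $-(\alpha + 1)^{-1}$, and divergent for $\alpha \ge -1$, since

$$\int_1^X x^\alpha\,dx = \begin{cases} \dfrac{X^{\alpha + 1} - 1}{\alpha + 1} & \text{if } \alpha \ne -1, \\ \ln X & \text{if } \alpha = -1. \end{cases}$$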
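The behavior of $I(X)$ in this example can be tabulated from the closed form. The sketch below is only an illustration; the function name `partial_integral` and the sample exponents and upper limits are assumptions chosen for display.

```python
import math

def partial_integral(alpha, X):
    """I(X) = integral of x**alpha over [1, X], evaluated from the closed form."""
    if alpha == -1:
        return math.log(X)
    return (X ** (alpha + 1) - 1) / (alpha + 1)

for alpha in (-2.0, -1.0, -0.5):
    row = [round(partial_integral(alpha, X), 4) for X in (1e1, 1e3, 1e6)]
    print(alpha, row)
```

For $\alpha = -2$ the values settle toward $-(\alpha + 1)^{-1} = 1$, while for $\alpha = -1$ and $\alpha = -0.5$ they keep growing as $X$ increases.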


b. The improper integrals

$$\int_0^\infty \cos x\,dx, \quad \int_0^\infty e^{ix}\,dx$$

are divergent, since both expressions

$$\int_0^X \cos x\,dx = \sin X, \quad \int_0^X e^{ix}\,dx = \frac{e^{iX} - 1}{i}$$

fail to approach a limit as $X \to \infty$.


11.12. THEOREM. Let $I(X)$ be given by (1). Then the following four assertions are
equivalent:

(a) Given any $\varepsilon > 0$, there exists a number $X_0 > a$ such that

$$|I - I(X)| = \left| I - \int_a^X f(x)\,dx \right| < \varepsilon$$

for all $X \ge X_0$ and some number $I$;

(b) The sequence $I(X_n)$ has a finite limit for every sequence $X_n \to \infty$;

(c) The series

$$\sum_{n=1}^{\infty} [I(X_{n+1}) - I(X_n)] = \sum_{n=1}^{\infty} \int_{X_n}^{X_{n+1}} f(x)\,dx$$

converges for every sequence $X_n \to \infty$;

(d) Given any $\varepsilon > 0$, there exists a number $X_0 > a$ such that

$$|I(X'') - I(X')| = \left| \int_{X'}^{X''} f(x)\,dx \right| < \varepsilon$$

for all $X' \ge X_0$, $X'' \ge X_0$ (the Cauchy convergence criterion for improper
integrals).


Proof. Assertion a is simply the definition of what is meant by the function $I(X)$
having a limit as $X \to \infty$, while Assertion d is the Cauchy convergence criterion
for the existence of this limit (Theorem 4.19) and hence is equivalent to
Assertion a. Assertion b is the equivalent of Assertion a in the language of
sequences (see Theorem 4.65), while Assertion c merely expresses the usual
connection between convergence of a series and convergence of the sequence of
partial sums of the series (see Sec. 6.11a). ∎

Note that in Assertion b it is essential that we allow any sequence $X_n \to \infty$,
since the convergence of the improper integral (2) is not implied by the
convergence of the sequence $I(X_n)$ just for some sequence $X_n \to \infty$. For example,

$$\int_0^{2\pi n} \cos x\,dx = 0 \quad (n = 1, 2, \ldots),$$

but the integral

$$\int_0^\infty \cos x\,dx$$

is divergent (Example 11.11b).
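The point of this remark can be seen by evaluating $I(X) = \sin X$ along two different sequences $X_n \to \infty$. In this sketch the second sequence $X_n = 2\pi n + \pi/2$ is an illustrative choice not taken from the text.

```python
import math

def I(X):
    """Closed form of the integral of cos x over [0, X]."""
    return math.sin(X)

along_periods = [I(2 * math.pi * n) for n in range(1, 6)]           # X_n = 2*pi*n
shifted = [I(2 * math.pi * n + math.pi / 2) for n in range(1, 6)]   # X_n = 2*pi*n + pi/2

print(along_periods)
print(shifted)
```

The two sequences of values have different limits, so $I(X)$ has no limit as $X \to \infty$ even though it converges along each sequence separately.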


11.13. a. If the integrand $f(x)$ of the improper integral (2) is nonnegative, then the
function $I(X)$ defined by (1) is nondecreasing. Then either $I(X)$ is unbounded on
the interval $a \le X < \infty$, in which case $I(X) \to \infty$ as $X \to \infty$ (see Theorem 4.55)
and we say that (2) diverges to $+\infty$ (Sec. 4.61), or else $I(X)$ is bounded on
$a \le X < \infty$, in which case

$$\lim_{X \to \infty} I(X) = \sup_{a \le X < \infty} I(X) \tag{3}$$

(see Theorem 4.53) and the integral (2) is convergent, with value (3).


b. THEOREM (Comparison test for improper integrals). Let $f_1(x)$ and $f_2(x)$ be two
nonnegative functions on $[a, \infty)$ integrable on every finite subinterval
$[a, X] \subset [a, \infty)$, and suppose $f_1(x) \le c f_2(x)$ for all $x \ge X_0$. Then convergence of the
integral

$$\int_a^\infty f_2(x)\,dx \tag{4}$$

implies convergence of

$$\int_a^\infty f_1(x)\,dx, \tag{5}$$

while divergence of (5) implies divergence of (4).

Proof. An immediate consequence of Sec. 11.13a and the inequality

$$I_1(X) - I_1(X_0) = \int_{X_0}^X f_1(x)\,dx \le c \int_{X_0}^X f_2(x)\,dx = c\,[I_2(X) - I_2(X_0)],$$

valid for all $X > X_0$. ∎


c. Example. Let $P(x)$ and $Q(x)$ be polynomials with complex coefficients of
degrees $p$ and $q$, respectively, where $Q(x)$ has no zeros in the interval
$a \le x < \infty$. Then the integral

$$\int_a^{\infty} \frac{P(x)}{Q(x)}\,dx \tag{6}$$

of the rational function $P(x)/Q(x)$ converges if $q \ge p + 2$ and diverges if
$q \le p + 1$. In fact, if $q \ge p + 2$, then

$$\left|\frac{P(x)}{Q(x)}\right| \le \frac{A}{x^2}$$

for some constant $A$ and all sufficiently large $x$. But then the convergence of (6)
follows by using the function $1/x^2$ in the comparison test (see Example 11.11a).†
On the other hand, suppose $q \le p + 1$. Then the rational function $xP(x)/Q(x)$ has
a nonzero “polynomial part,” say

$$\frac{xP(x)}{Q(x)} = \frac{P_1(x)}{Q_1(x)} + b_0 + b_1 x + \cdots + b_n x^n,$$

where $n \ge 0$, $b_n \ne 0$, and the degree of $P_1(x)$ is less than that of $Q_1(x)$. It follows
that

$$\frac{P(x)}{Q(x)} = \frac{1}{x}\,\frac{P_1(x)}{Q_1(x)} + \frac{b_0}{x} + b_1 + \cdots + b_n x^{n-1},$$

and hence

$$\int_a^X \frac{P(x)}{Q(x)}\,dx = \int_a^X \frac{1}{x}\,\frac{P_1(x)}{Q_1(x)}\,dx + b_0 \ln\frac{X}{a} + b_1 (X - a) + \cdots + \frac{b_n}{n}\,(X^n - a^n).$$

As just shown, the first term on the right has a finite limit (the degree of the
denominator is at least 2 greater than the degree of the numerator), but the sum
of the remaining terms obviously has no limit as $X \to \infty$, since its absolute value
approaches infinity as $X \to \infty$. Therefore (6) diverges if $q \le p + 1$, as asserted.


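d. Example. Let

$$P(x) = b_0 + b_1 x + \cdots + b_n x^n \qquad (b_n > 0)$$

be a polynomial with real coefficients. Then

$$\frac{c_1}{x^{n/m}} \le \frac{1}{\sqrt[m]{b_0 + b_1 x + \cdots + b_n x^n}} \le \frac{c_2}{x^{n/m}}$$

for suitable positive constants $c_1$, $c_2$ and sufficiently large $x$, say $x \ge a$. Hence,
by Example 11.11a, the integral

$$\int_a^{\infty} \frac{dx}{\sqrt[m]{P(x)}}$$

converges if $n/m > 1$ and diverges if $n/m \le 1$.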


11.14. Next we use improper integrals to prove a useful test (due to Cauchy) for
the convergence of numerical series:

THEOREM (Integral test). Given a numerical series

$$\sum_{n=1}^{\infty} a_n \tag{7}$$

with positive nonincreasing terms ($a_{n+1} \le a_n$), let $a(x)$ be a positive
nonincreasing function such that $a(n) = a_n$. Then convergence of the integral

$$\int_1^{\infty} a(x)\,dx \tag{8}$$

implies convergence of the series (7), while divergence of (8) implies divergence
of (7).


Proof. An immediate consequence of the fact that

$$a_2 + \cdots + a_n \le \int_1^n a(x)\,dx \le a_1 + \cdots + a_{n-1},$$

which is in turn clear from Figure 74. ∎
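
The two-sided inequality used in the proof of the integral test can be checked on a concrete case. This is a sketch only, assuming the particular choice $a(x) = 1/x^2$ (whose integral over $[1, n]$ is $1 - 1/n$); the helper names `a`, `integral_1_to_n` and `sandwich` are hypothetical.

```python
def a(x):
    """Sample positive nonincreasing function, a(x) = 1/x^2 (an assumption)."""
    return 1.0 / x**2

def integral_1_to_n(n):
    """Closed form of the integral of 1/x^2 over [1, n]."""
    return 1.0 - 1.0 / n

def sandwich(n):
    lower = sum(a(k) for k in range(2, n + 1))   # a_2 + ... + a_n
    upper = sum(a(k) for k in range(1, n))       # a_1 + ... + a_{n-1}
    return lower, integral_1_to_n(n), upper

for n in (2, 10, 100, 1000):
    lo, mid, hi = sandwich(n)
    print(n, lo, mid, hi)
```

In every row the middle value lies between the two partial sums, so boundedness of the integrals and boundedness of the partial sums go together.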
Examples

a. According to Example 11.11a, the integral

$$\int_1^{\infty} \frac{dx}{x^{\alpha}}$$

converges if $\alpha > 1$ and diverges if $0 < \alpha \le 1$. Hence, by the integral test, the
corresponding series

$$\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}$$

converges if $\alpha > 1$ and diverges if $0 < \alpha \le 1$ (see Example 6.15a).


Figure 74


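b. The formula

$$\int_a^{\infty} \frac{dx}{x(\ln x)^{\alpha}} = \int_{\ln a}^{\infty} \frac{du}{u^{\alpha}} \qquad (u = \ln x,\ a > 1)$$

shows that the improper integral on the left converges if $\alpha > 1$ and diverges if
$0 \le \alpha \le 1$. Hence the corresponding series

$$\sum_{n=2}^{\infty} \frac{1}{n(\ln n)^{\alpha}}$$

converges if $\alpha > 1$ and diverges if $0 \le \alpha \le 1$ (see Example 6.15c).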


c. Invoking geometrical considerations like those used to prove the integral test
itself, we can deduce various estimates for the remainder of convergent series
and for partial sums of both convergent and divergent series. For example,
Figure 75 shows at once that

$$\sum_{k=n}^{\infty} \frac{1}{k^{\lambda}} < \int_{n-1}^{\infty} \frac{dx}{x^{\lambda}} = \frac{1}{\lambda - 1}\,\frac{1}{(n-1)^{\lambda - 1}} \qquad (\lambda > 1),$$

while choosing $a(x) = 1/x$ in Figure 74, we get

$$\sum_{k=2}^{n} \frac{1}{k} < \int_1^n \frac{dx}{x} = \ln n < \sum_{k=1}^{n-1} \frac{1}{k}.$$


Figure 75

Figure 76


Similarly, it follows from Figure 76 that

$$\sum_{k=1}^{n} \left\{ a_k - \int_k^{k+1} a(x)\,dx \right\} < a_1,$$

where the quantity on the left increases with $n$, approaching a limit $C$ as $n \to \infty$,
so that

$$\sum_{k=1}^{n} a_k = \int_1^{n+1} a(x)\,dx + C - \gamma_n, \tag{9}$$

where $\gamma_n \searrow 0$ as $n \to \infty$. The value of the constant $C$ for $a_k = 1/k$, known as
Euler’s constant, is often encountered in analysis. This constant, first calculated
by Euler in 1734, equals $C = 0.5772\ldots$ Thus, choosing $a_k = 1/k$ in (9), we find

$$1 + \frac{1}{2} + \cdots + \frac{1}{n} = \ln (n+1) + 0.5772\ldots - \gamma_n \qquad (\gamma_n \searrow 0).$$
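
The monotone approach to Euler's constant described above is easy to observe. The snippet below is a rough numerical sketch, not a method of computing the constant to high accuracy; the function name `euler_gap` is illustrative.

```python
import math

def euler_gap(n):
    """H_n - ln(n + 1), i.e. the sum of a_k minus the integral for a_k = 1/k."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n + 1)

values = [euler_gap(n) for n in (1, 10, 100, 1000, 10000)]
print(values)
```

The printed values increase with $n$ and stay below $0.5772\ldots$, in agreement with $\gamma_n \searrow 0$.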


11.15. Next we establish the analogues for improper integrals of the formulas for
integration by parts and integration by substitution.


a. Let $u(x)$ and $v(x)$ be two functions defined on $[a, \infty)$, both piecewise smooth
on every finite subinterval $[a, X] \subset [a, \infty)$. Then integration by parts gives

$$\int_a^X u\,dv = uv\,\Big|_a^X - \int_a^X v\,du, \tag{10}$$

as in Sec. 9.51a. Taking the limit as $X \to \infty$ in (10), we get

$$\int_a^{\infty} u\,dv = uv\,\Big|_a^{\infty} - \int_a^{\infty} v\,du, \tag{10'}$$

provided that at least two of the three limits involved exist (in which case the
other exists automatically). Here, of course, the “integrated term” in (10′) is just

$$uv\,\Big|_a^{\infty} = \lim_{X \to \infty} uv\,\Big|_a^X = \lim_{X \to \infty} [u(X)v(X) - u(a)v(a)].$$


b. Similarly, suppose $u(t)$ has a continuous derivative on every finite subinterval
$[\alpha, \xi] \subset [\alpha, \infty)$, and suppose $f(u)$ is continuous on the interval
$\{u : u = u(t),\ \alpha \le t < \infty\}$. Then

$$\int_a^x f(u)\,du = \int_{\alpha}^{\xi} f(u(t))\,u'(t)\,dt, \tag{11}$$

as in Sec. 9.53, where $u(\alpha) = a$, $u(\xi) = x$. Suppose $\xi \to \infty$ implies $x \to b$ ($b \le \infty$).
Then, taking the limit as $\xi \to \infty$ in (11), we get

$$\int_a^b f(u)\,du = \int_{\alpha}^{\infty} f(u(t))\,u'(t)\,dt, \tag{11'}$$

provided that at least one of the integrals in (11′) exists (in which case the other
exists automatically).


11.2. Convergence of Improper Integrals 


11.21. Consider the improper integral

$$\int_a^{\infty} f(x)\,dx \tag{1}$$

where $f(x)$ is a complex function which is piecewise continuous on every finite
subinterval $[a, X] \subset [a, \infty)$.


a. THEOREM. If the integral

$$\int_a^{\infty} |f(x)|\,dx \tag{2}$$

converges, then so does the integral (1).


Proof. An immediate consequence of the Cauchy convergence criterion for
improper integrals (see Theorem 11.12) and the inequality

$$\left|\int_{X'}^{X''} f(x)\,dx\right| \le \int_{X'}^{X''} |f(x)|\,dx,$$

valid for all $a \le X' \le X''$. ∎


b. Definition. An improper integral (1) is said to be absolutely convergent if the
integral (2) converges.† It may turn out that the integral (1) converges while the
integral (2) diverges (an example will be given below). We then say that (1) is
conditionally convergent, just as in the analogous situation for series (see Sec.
6.22).


11.22. Next we consider convergence tests for nonabsolutely convergent 
improper integrals. 


THEOREM (Leibniz’s test for improper integrals). Suppose $f(x)$ is real and has
infinitely many zeros $a_1, a_2, \ldots, a_n, \ldots$ in the interval $[a, \infty)$, where
$a_1 < a_2 < \cdots < a_n < \cdots$, $a_n \to \infty$ as $n \to \infty$. Suppose $f(x) > 0$ if
$a_{2n-1} < x < a_{2n}$, while $f(x) < 0$ if $a_{2n} < x < a_{2n+1}$ (see Figure 77), and let

$$b_n = \int_{a_n}^{a_{n+1}} f(x)\,dx.$$

Suppose further that $|b_n| \ge |b_{n+1}|$ $(n = N, N + 1, \ldots)$ and $|b_n| \to 0$ as $n \to \infty$.
Then the integral (1) converges.

Figure 77


Proof. Given any $X \ge a_1$, let $n$ be such that $a_n \le X < a_{n+1}$. Then

$$\int_a^X f(x)\,dx = \int_a^{a_1} f(x)\,dx + [b_1 + \cdots + b_{n-1}] + \int_{a_n}^X f(x)\,dx. \tag{3}$$

The first term on the right is constant, while the term in brackets approaches a
(finite) limit as $n \to \infty$, by Leibniz’s test for series (Theorem 6.23). The last term
does not exceed

$$\int_{a_n}^{a_{n+1}} |f(x)|\,dx = |b_n|$$

in absolute value, and hence approaches zero as $n \to \infty$. It follows that the
left-hand side of (3) approaches a finite limit as $X \to \infty$. ∎


Examples


a. Consider the improper integral

$$\int_a^{\infty} g(x) \sin x\,dx, \tag{4}$$

where $g(x) > 0$ and $g(x) \searrow 0$ as $x \to \infty$. The consecutive zeros of the function
$g(x) \sin x$ are at the points $x = n\pi, (n + 1)\pi, \ldots$ (where $n\pi \ge a$). Since, obviously,
$g(x)|\sin x| \ge g(x + \pi)|\sin (x + \pi)|$, we have

$$|b_n| = \int_{n\pi}^{(n+1)\pi} g(x)|\sin x|\,dx \ge \int_{n\pi}^{(n+1)\pi} g(x+\pi)|\sin (x+\pi)|\,dx = \int_{(n+1)\pi}^{(n+2)\pi} g(x)|\sin x|\,dx = |b_{n+1}|.$$

Moreover,

$$|b_n| = \int_{n\pi}^{(n+1)\pi} g(x)|\sin x|\,dx \le \pi g(n\pi) \to 0$$

as $n \to \infty$. Thus the conditions of Leibniz’s test are satisfied, and the integral (4)
converges.
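If

$$\int_a^{\infty} g(x)\,dx < \infty,$$

the integral (4) is absolutely convergent. On the other hand, if

$$\int_a^{\infty} g(x)\,dx = \infty,$$

then the integral

$$\int_a^{\infty} g(x)|\sin x|\,dx \tag{5}$$

diverges. In fact, the inequalities

$$g(x) \ge g(n\pi), \qquad |\sin x| \ge \frac{1}{2}$$

hold in the interval

$$(n-1)\pi + \frac{\pi}{6} \le x \le n\pi - \frac{\pi}{6},$$

and hence

$$\int_{(n-1)\pi}^{n\pi} g(x)|\sin x|\,dx \ge \int_{(n-1)\pi + (\pi/6)}^{n\pi - (\pi/6)} g(x)|\sin x|\,dx \ge \frac{1}{2}\,g(n\pi)\,\frac{2\pi}{3}.$$

But the series with general term $g(n\pi)$ diverges, by the integral test (Theorem
11.14), since

$$\int_{a/\pi}^{\infty} g(\pi x)\,dx = \frac{1}{\pi} \int_a^{\infty} g(u)\,du,$$

where the integral on the right diverges. Therefore the integral (5) also diverges,
by Theorem 11.12.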


b. The substitution $x^{\gamma} = u$ transforms the integral

$$\int_a^{\infty} \sin (x^{\gamma})\,dx \qquad (\gamma > 0) \tag{6}$$

into the integral

$$\frac{1}{\gamma} \int_{a^{\gamma}}^{\infty} u^{(1/\gamma) - 1} \sin u\,du,$$

which, as just shown, converges if $(1/\gamma) - 1 < 0$, i.e., if $\gamma > 1$.† Note that the
integrand $\sin (x^{\gamma})$ of the convergent integral (6) does not approach zero as
$x \to \infty$, as might be expected on the basis of a formal analogy with the case of series.
In fact, the analogue of the assertion “the general term of a convergent numerical
series approaches zero” is not the assertion “the integrand of a convergent
improper integral approaches zero as $x \to \infty$,” which, as we see, is false, but
rather the assertion “the integral of the integrand of a convergent improper
integral taken over an interval of fixed length approaches zero as the interval
moves off to infinity.” In other words,

$$\lim_{X \to \infty} \int_X^{X+h} f(x)\,dx = 0$$

for any fixed $h > 0$, as follows at once from the Cauchy convergence criterion
for improper integrals (see Theorem 11.12).


11.23. The Abel-Dirichlet test for improper integrals. The Abel-Dirichlet test
gives conditions under which an improper integral of the form

$$\int_a^{\infty} f(x) g(x)\,dx \tag{7}$$

converges. First we consider the general case where the functions $f(x)$ and $g(x)$
are both complex:


a. THEOREM. Let $f(x)$ be piecewise smooth on every finite subinterval $[a, X] \subset [a, \infty)$,
with derivative $f'(x)$, and suppose $f(x) \to 0$ as $x \to \infty$ while $f'(x)$ is absolutely
integrable on $[a, \infty)$. Moreover, let $g(x)$ be piecewise continuous on every finite
subinterval $[a, X] \subset [a, \infty)$, and suppose $|G(x)| \le C$ $(a \le x < \infty)$, where

$$G(x) = \int_a^x g(\xi)\,d\xi$$

and $C$ is independent of $x \in [a, \infty)$. Then the integral (7) converges.


Proof. Integrating by parts between finite limits $p$ and $q$, we get

$$\int_p^q f(x) g(x)\,dx = f(x) G(x)\,\Big|_p^q - \int_p^q G(x) f'(x)\,dx \qquad (a < p < q). \tag{8}$$

The first term on the right satisfies the inequality

$$\left| f(x) G(x)\,\Big|_p^q \right| = |f(q) G(q) - f(p) G(p)| \le 2C \max \{|f(p)|, |f(q)|\},$$

and hence approaches zero as $p, q \to \infty$, since $f(x) \to 0$ as $x \to \infty$, while the second
term satisfies the inequality

$$\left| \int_p^q G(x) f'(x)\,dx \right| \le C \int_p^q |f'(x)|\,dx,$$

and hence also approaches zero as $p, q \to \infty$, by the absolute integrability of $f'(x)$
on $[a, \infty)$. Therefore the left-hand side of (8) approaches zero as $p, q \to \infty$, i.e., the
integral (7) satisfies the Cauchy convergence criterion and hence converges. ∎


b. If $f(x)$ is real, we can modify the conditions on $f(x)$ somewhat, obtaining the
following version of the Abel-Dirichlet test:

THEOREM.† Let the real function $f(x)$ be piecewise smooth on every finite
subinterval $[a, X] \subset [a, \infty)$, and suppose $f(x) \searrow 0$ (or $\nearrow 0$) as $x \to \infty$. Moreover,
let the (complex) function $g(x)$ be the same as in the preceding theorem. Then the
integral (7) converges.


Proof. Under these conditions, $f'(x)$ does not change sign, and hence

$$\int_a^X |f'(x)|\,dx = \left| \int_a^X f'(x)\,dx \right| = |f(X) - f(a)|$$

has a finite limit as $X \to \infty$, i.e., $f'(x)$ is absolutely integrable on $[a, \infty)$, as before. ∎


c. Example. Let $P(x)$ and $Q(x)$ be two polynomials in $x$ with complex
coefficients, and let $\alpha$ be any nonzero real number. Consider the integral

$$\int_a^{\infty} \frac{P(x)}{Q(x)}\,e^{i\alpha x}\,dx, \tag{9}$$

assuming that $Q(x)$ has no zeros in the interval $a \le x < \infty$, and that the degree of
$Q(x)$ exceeds that of $P(x)$ by at least 1. Writing

$$f(x) = \frac{P(x)}{Q(x)},$$

we find that

$$f'(x) = \frac{P'(x) Q(x) - P(x) Q'(x)}{Q^2(x)}$$

is a rational function with a denominator whose degree exceeds that of its
numerator by at least 2. Therefore $f'(x)$ is absolutely integrable on $[a, \infty)$, by
Example 11.13c. Moreover, the function $g(x) = e^{i\alpha x}$ has the antiderivative

$$G(x) = \frac{e^{i\alpha x}}{i\alpha},$$

which is bounded on $[a, \infty)$. Applying Theorem 11.23a, we see at once that (9)
converges.


11.3. Improper Integrals of the Second and Third Kinds 


11.31. Improper integrals of the second kind 


a. Let $f(x)$ be a complex function defined on a finite interval $[a, b]$ and integrable
on every subinterval $[a + \varepsilon, b] \subset [a, b]$, but possibly not integrable (e.g.,
unbounded) on the whole interval $[a, b]$.‡ Suppose the function

$$I(\varepsilon) = \int_{a+\varepsilon}^b f(x)\,dx$$

approaches a finite limit $I$ as $\varepsilon \searrow 0$. Then the expression

$$\int_a^b f(x)\,dx, \tag{1}$$

called an improper integral of the second kind, is said to be convergent, with
value $I$. In other words, we set

$$\int_a^b f(x)\,dx = \lim_{\varepsilon \searrow 0} I(\varepsilon) = \lim_{\varepsilon \searrow 0} \int_{a+\varepsilon}^b f(x)\,dx, \tag{2}$$

by definition. On the other hand, if $I(\varepsilon)$ does not approach a finite limit as $\varepsilon \searrow 0$,
we call the integral (1) divergent and assign it no value at all.


b. THEOREM. If the integral (1) exists as an ordinary integral with value $I$, then it
exists in the sense of the definition (2) and has the same value $I$.


Proof. If (1) exists in the ordinary sense, then the real and imaginary parts of $f(x)$
are bounded (by Theorem 11.15c), and hence $f(x)$ is bounded, so that $|f(x)| \le C$,
say. Then, given any $\varepsilon > 0$,

$$I = \int_a^b f(x)\,dx = \int_a^{a+\varepsilon} f(x)\,dx + \int_{a+\varepsilon}^b f(x)\,dx,$$

where

$$\left| \int_a^{a+\varepsilon} f(x)\,dx \right| \le C\varepsilon.$$

Therefore

$$\lim_{\varepsilon \searrow 0} \int_{a+\varepsilon}^b f(x)\,dx = I - \lim_{\varepsilon \searrow 0} \int_a^{a+\varepsilon} f(x)\,dx = I. \; ∎$$

c. The above definition of an improper integral of the second kind closely
resembles the definition of an improper integral of the first kind (i.e., with an
infinite upper limit), given in Sec. 11.11. In fact, an improper integral of the
second kind can be immediately transformed into an improper integral of the
first kind by making the substitution

$$x - a = \frac{1}{u}.$$

Hence, starting from the theory of improper integrals of the first kind, we can
deduce a whole theory of improper integrals of the second kind, formulating
analogues of all the theorems of Secs. 11.1 and 11.2.† Thus, at this point, we can
confine ourselves to a few subsidiary comments.


d. The above considerations have obvious analogues for the case of a function
$f(x)$ defined on a finite interval $[a, b]$ and integrable on every subinterval
$[a, b - \varepsilon] \subset [a, b]$, but possibly not integrable on the whole interval $[a, b]$. In this
case, the appropriate improper integral (of the second kind) is defined as

$$\int_a^b f(x)\,dx = \lim_{\varepsilon \searrow 0} \int_a^{b-\varepsilon} f(x)\,dx$$

(give further details).


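e. As always, to apply the comparison test we need suitable “standard integrals.”
In the case of improper integrals of the second kind, the most commonly used
standard integral is

$$\int_a^b \frac{dx}{(x-a)^{\lambda}}. \tag{3}$$

Since

$$\int_{a+\varepsilon}^b \frac{dx}{(x-a)^{\lambda}} = \begin{cases} \dfrac{1}{1-\lambda} \left[ (b-a)^{1-\lambda} - \varepsilon^{1-\lambda} \right] & \text{if } \lambda \ne 1, \\[2ex] \ln \dfrac{b-a}{\varepsilon} & \text{if } \lambda = 1, \end{cases}$$

we see that (3) converges if $\lambda < 1$ and diverges if $\lambda \ge 1$. Applying the
comparison test to a nonnegative function $f(x)$, we find that if

$$f(x) \le \frac{c}{(x-a)^{\lambda}} \quad \text{with } \lambda < 1$$

for all $a < x \le b_0 < b$, then the integral (1) converges, while if

$$f(x) \ge \frac{c}{(x-a)^{\lambda}} \quad \text{with } c > 0,\ \lambda \ge 1$$

for all such $x$, then the integral (1) diverges.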
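
The behavior of the standard integral can be tabulated directly from its closed form. This is a small sketch under the assumption $a = 0$, $b = 1$; the function name `tail_integral` and its default arguments are illustrative only.

```python
import math

def tail_integral(eps, lam, a=0.0, b=1.0):
    """Integral of (x - a)**(-lam) over [a + eps, b], in closed form."""
    if lam == 1.0:
        return math.log((b - a) / eps)
    return ((b - a) ** (1 - lam) - eps ** (1 - lam)) / (1 - lam)

for lam in (0.5, 1.0, 1.5):
    # Shrink eps and watch whether the truncated integral settles down.
    print(lam, [tail_integral(10.0 ** -k, lam) for k in (2, 4, 6, 8)])
```

For $\lambda = 1/2$ the values approach $(b-a)^{1-\lambda}/(1-\lambda) = 2$, while for $\lambda = 1$ and $\lambda = 3/2$ they grow without bound as $\varepsilon$ shrinks.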


f. In particular, consider the integral

$$\int_0^b |\ln x|^{\gamma}\,dx \qquad (\gamma > 0). \tag{4}$$

For small $x$, we have

$$|\ln x| \le c x^{-\alpha}$$

for every $\alpha > 0$ (cf. Theorem 5.56). Choosing $\alpha = 1/2\gamma$ and comparing (4) with
the convergent integral

$$\int_0^b x^{-1/2}\,dx,$$

we find that (4) converges for all $\gamma > 0$.


11.32. Improper integrals of the third kind 


a. Let $f(x)$ be a complex function defined on an interval $[a, b]$, where the end
points may be infinite, i.e., where the values $a = -\infty$, $b = +\infty$ are allowed.† Then
by a singular point‡ of $f(x)$ we mean

(a) The point $a$ if $a = -\infty$;

(b) The point $b$ if $b = +\infty$;

(c) Any point $c \in (a, b)$ such that $f(x)$ fails to be integrable in the ordinary sense
in every neighborhood of $c$;

(d) The point $a$ if $a$ is finite and $f(x)$ fails to be integrable in every one-sided
neighborhood $a \le x < a + \varepsilon$;

(e) The point $b$ if $b$ is finite and $f(x)$ fails to be integrable in every one-sided
neighborhood $b - \varepsilon < x \le b$.


b. Suppose $f(x)$ has no more than finitely many singular points in $[a, b]$, and let
$f(x)$ be continuous or piecewise continuous on every set obtained by deleting
neighborhoods of these points from the interval $[a, b]$. Choosing a point $p_j$
between every pair of consecutive singular points $c_j$, $c_{j+1}$ of $f(x)$, we get a set of
intervals $[c_j, p_j]$, $[p_j, c_{j+1}]$, each containing only one singular point (as an end
point). The integral over each of these intervals is defined as the appropriate
improper integral of the first or second kind, i.e.,

$$\int_{c_j}^{p_j} f(x)\,dx = \lim_{X \searrow c_j} \int_X^{p_j} f(x)\,dx, \qquad \int_{p_j}^{c_{j+1}} f(x)\,dx = \lim_{X \nearrow c_{j+1}} \int_{p_j}^X f(x)\,dx.$$

Suppose all these improper integrals converge. Then the integral

$$\int_a^b f(x)\,dx,$$

called an improper integral of the third kind, is said to be convergent, with value

$$\int_a^b f(x)\,dx = \sum_j \left\{ \int_{c_j}^{p_j} f(x)\,dx + \int_{p_j}^{c_{j+1}} f(x)\,dx \right\}, \tag{5}$$

where the sum is over all the singular points $c_j$ of $f(x)$.

We must still verify that (5) does not depend on the choice of the points $p_j$. It
is clearly enough to do this for one interval $[c_j, c_{j+1}]$. Let $c_j < p_j < q_j < c_{j+1}$.
Then

$$\begin{aligned}
\int_{c_j}^{p_j} f(x)\,dx + \int_{p_j}^{c_{j+1}} f(x)\,dx
&= \lim_{X \searrow c_j} \int_X^{p_j} + \lim_{Y \nearrow c_{j+1}} \int_{p_j}^Y \\
&= \lim_{X \searrow c_j} \int_X^{p_j} + \lim_{Y \nearrow c_{j+1}} \left\{ \int_{p_j}^{q_j} + \int_{q_j}^Y \right\} \\
&= \lim_{X \searrow c_j} \left\{ \int_X^{p_j} + \int_{p_j}^{q_j} \right\} + \lim_{Y \nearrow c_{j+1}} \int_{q_j}^Y \\
&= \lim_{X \searrow c_j} \int_X^{q_j} + \lim_{Y \nearrow c_{j+1}} \int_{q_j}^Y
= \int_{c_j}^{q_j} f(x)\,dx + \int_{q_j}^{c_{j+1}} f(x)\,dx,
\end{aligned}$$

as asserted.†

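c. Example. Consider the integral

$$\int_{-\infty}^{\infty} \frac{dx}{|x - c_1|^{\alpha_1} \cdots |x - c_n|^{\alpha_n}} \qquad (-\infty < c_1 < \cdots < c_n < \infty) \tag{6}$$

with singular points $-\infty, c_1, \ldots, c_n, \infty$. The integrals

$$\int_{c_j}^{p_j}, \qquad \int_{p_{j-1}}^{c_j}$$

(again we omit the integrand) converge if and only if $\alpha_j < 1$, since the factors
other than

$$\frac{1}{|x - c_j|^{\alpha_j}}$$

are bounded in a neighborhood of the point $c_j$. Moreover, the integrals

$$\int_{-\infty}^{p_0}, \qquad \int_{p_n}^{\infty}$$

(with $p_0 < c_1$ and $p_n > c_n$) converge if and only if $\alpha_1 + \cdots + \alpha_n > 1$. Note that
the integral

$$\int_{-\infty}^{\infty} \frac{dx}{|x - c|^{\alpha}},$$

to which (6) reduces for $n = 1$, diverges for every $\alpha$, since we cannot have both
$\alpha < 1$ and $\alpha > 1$. The same is true of both integrals

$$\int_{-\infty}^{c} \frac{dx}{(c - x)^{\alpha}}, \qquad \int_c^{\infty} \frac{dx}{(x - c)^{\alpha}}.$$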


11.4. Evaluation of Improper Integrals by Residues 


11.41. Integrals of rational functions 


a. Consider the improper integral

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,dx, \tag{1}$$

where the integrand is a rational function, i.e., the quotient $P(x)/Q(x)$ of two
polynomials (with complex coefficients). Just as in Example 11.13c, the integral
(1) converges if the denominator has no real zeros and if the degree of the
denominator is at least 2 greater than the degree of the numerator. We now
examine the problem of actually evaluating (1). Of course, it follows from the
natural generalization of Theorem 9.33 to the case of convergent improper
integrals that (1) equals the quantity

$$G(\infty) - G(-\infty) = \lim_{\substack{a \to -\infty \\ b \to \infty}} [G(b) - G(a)],$$

where $G(x)$ is the indefinite integral of $P(x)/Q(x)$, which can be found by the
technique of Sec. 9.42. However, as we now show, (1) can often be evaluated
much more quickly by taking advantage of the analyticity of the function
$P(x)/Q(x)$, or, more exactly, of the fact that the function $P(z)/Q(z)$ of the complex
variable $z$ is analytic at every point of the $z$-plane except at the finitely many
zeros of the denominator $Q(z)$.

Thus let $L_R^+$ be the (piecewise smooth) closed path in the upper half-plane
made up of the interval $[-R, R]$ of the real axis and the semicircle

$$C_R^+ = \{z = Re^{i\theta},\ 0 \le \theta \le \pi\}, \tag{2}$$

where $L_R^+$ and $C_R^+$ are traversed in the counterclockwise direction and $R$ is so
large that all the zeros of the denominator $Q(z)$ in the upper half-plane, say
$z_1, \ldots, z_m$, lie inside $L_R^+$ (see Figure 78). It follows from formula (4), p. 407, that

$$\oint_{L_R^+} \frac{P(z)}{Q(z)}\,dz = \int_{-R}^{R} \frac{P(x)}{Q(x)}\,dx + \int_{C_R^+} \frac{P(z)}{Q(z)}\,dz = 2\pi i \sum_{j=1}^{m} \operatorname*{Res}_{z=z_j} \frac{P(z)}{Q(z)}. \tag{3}$$

Figure 78

Using the condition on the degrees of $P(z)$ and $Q(z)$, we see that the inequality

$$\left| \frac{P(z)}{Q(z)} \right| \le \frac{A}{R^2}$$

holds on $C_R^+$ for some constant $A$ and all sufficiently large $R$. Therefore

$$\lim_{R \to \infty} \int_{C_R^+} \frac{P(z)}{Q(z)}\,dz = 0,$$

since

$$\left| \int_{C_R^+} \frac{P(z)}{Q(z)}\,dz \right| \le \frac{A}{R^2}\,\pi R,$$

and hence

$$\lim_{R \to \infty} \oint_{L_R^+} \frac{P(z)}{Q(z)}\,dz = \lim_{R \to \infty} \int_{-R}^{R} \frac{P(x)}{Q(x)}\,dx = 2\pi i \sum_{j=1}^{m} \operatorname*{Res}_{z=z_j} \frac{P(z)}{Q(z)}.$$

But since the integral (1) converges, it must coincide with

$$\lim_{R \to \infty} \int_{-R}^{R} \frac{P(x)}{Q(x)}\,dx,$$

so that, finally,

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,dx = 2\pi i \sum_{j=1}^{m} \operatorname*{Res}_{z=z_j} \frac{P(z)}{Q(z)}. \tag{4}$$


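b. If the zeros $z_1, \ldots, z_m$ are all simple, then

$$\operatorname*{Res}_{z=z_j} \frac{P(z)}{Q(z)} = \frac{P(z_j)}{Q'(z_j)}, \tag{5}$$

by formula (3), p. 406, and hence (4) becomes

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,dx = 2\pi i \sum_{j=1}^{m} \frac{P(z_j)}{Q'(z_j)}. \tag{6}$$

For example,

$$\int_{-\infty}^{\infty} \frac{dx}{x^2 + 1} = 2\pi i \left. \frac{1}{2z} \right|_{z=i} = \pi.$$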

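c. If the zeros $z_1, \ldots, z_m$ are multiple, with orders $k_1, \ldots, k_m$ respectively, then (5)
must be replaced by the more general formula (3′), p. 407,

$$\operatorname*{Res}_{z=z_j} \frac{P(z)}{Q(z)} = \frac{1}{(k_j - 1)!} \left[ (z - z_j)^{k_j}\,\frac{P(z)}{Q(z)} \right]^{(k_j - 1)}_{z=z_j}, \tag{5'}$$

so that

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,dx = 2\pi i \sum_{j=1}^{m} \frac{1}{(k_j - 1)!} \left[ (z - z_j)^{k_j}\,\frac{P(z)}{Q(z)} \right]^{(k_j - 1)}_{z=z_j}. \tag{6'}$$

For example,

$$\int_{-\infty}^{\infty} \frac{dx}{(x^2 + 1)^2} = 2\pi i \left[ \frac{(z - i)^2}{(z^2 + 1)^2} \right]'_{z=i} = 2\pi i \left[ \frac{1}{(z + i)^2} \right]'_{z=i} = \left. \frac{-4\pi i}{(z + i)^3} \right|_{z=i} = \frac{-4\pi i}{-8i} = \frac{\pi}{2}.$$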
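
The two worked examples can be cross-checked numerically. The sketch below is only an illustration: it evaluates the residue expressions by hand-coded complex arithmetic and compares them with a crude midpoint-rule integration over the real line after the substitution $x = \tan t$. The helpers `midpoint` and `on_real_line` are hypothetical names, and the step count is an arbitrary choice.

```python
import math

def midpoint(f, lo, hi, steps=20_000):
    """Plain midpoint rule; crude but dependency-free."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

def on_real_line(f):
    """Integrate f over (-inf, inf) via the substitution x = tan(t)."""
    return midpoint(lambda t: f(math.tan(t)) / math.cos(t) ** 2,
                    -math.pi / 2, math.pi / 2)

# 1/(x^2 + 1): simple zero of Q at z = i, residue P(i)/Q'(i) = 1/(2i).
simple = 2j * math.pi * (1 / (2 * 1j))

# 1/(x^2 + 1)^2: double zero at z = i, residue d/dz (z + i)^(-2) at z = i.
double = 2j * math.pi * (-2 / (2 * 1j) ** 3)

print(simple, on_real_line(lambda x: 1 / (x * x + 1)))
print(double, on_real_line(lambda x: 1 / (x * x + 1) ** 2))
```

Both rows should agree to several decimal places, with the residue side equal to $\pi$ and $\pi/2$ respectively.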


d. We have just shown that the improper integral (1) equals $2\pi i$ times the sum of
the residues of the function $P(z)/Q(z)$ in the upper half-plane, by evaluating the
integral of $P(z)/Q(z)$ along the contour $L_R^+$ made up of the segment $[-R, R]$ of the
real axis and the semicircle (2). But we could just as well have carried out a
similar calculation, using the contour $L_R^-$ made up of the interval $[-R, R]$, this
time traversed from right to left, and the semicircle

$$C_R^- = \{z = Re^{i\theta},\ \pi \le \theta \le 2\pi\} \tag{2'}$$

in the lower half-plane ($L_R^-$ and $C_R^-$ are again traversed in the counterclockwise
direction). This gives

$$\oint_{L_R^-} \frac{P(z)}{Q(z)}\,dz = \int_{R}^{-R} \frac{P(x)}{Q(x)}\,dx + \int_{C_R^-} \frac{P(z)}{Q(z)}\,dz = 2\pi i \sum_{j=1}^{r} \operatorname*{Res}_{z=z_j'} \frac{P(z)}{Q(z)}$$

instead of (3), where $z_1', \ldots, z_r'$ are now the zeros of the denominator $Q(z)$ in the
lower half-plane. Taking the limit as $R \to \infty$, we get

$$\lim_{R \to \infty} \oint_{L_R^-} \frac{P(z)}{Q(z)}\,dz = \lim_{R \to \infty} \int_{R}^{-R} \frac{P(x)}{Q(x)}\,dx = 2\pi i \sum_{j=1}^{r} \operatorname*{Res}_{z=z_j'} \frac{P(z)}{Q(z)},$$

since

$$\lim_{R \to \infty} \int_{C_R^-} \frac{P(z)}{Q(z)}\,dz = 0,$$

as before, and hence

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,dx = -2\pi i \sum_{j=1}^{r} \operatorname*{Res}_{z=z_j'} \frac{P(z)}{Q(z)}. \tag{4'}$$

Note that (4′), unlike (4), has a minus sign in the right-hand side. Since the
left-hand sides of (4) and (4′) must coincide, it follows at once that the sum of the
residues of the function $P(z)/Q(z)$ at all the zeros of $Q(z)$, in both the upper and
lower half-planes, must vanish.

The italicized assertion can easily be proved directly. In fact, the sum of the
residues of $P(z)/Q(z)$ at all the zeros of $Q(z)$ must equal the integral

$$\frac{1}{2\pi i} \oint_{|z|=R} \frac{P(z)}{Q(z)}\,dz \tag{7}$$

along the full circle $|z| = R$, provided that the radius $R$ is large enough so that all
the zeros of $Q(z)$ lie inside the circle. But (7) does not depend on $R$ (for
sufficiently large $R$), while at the same time (7) satisfies the estimate

$$\left| \frac{1}{2\pi i} \oint_{|z|=R} \frac{P(z)}{Q(z)}\,dz \right| \le \frac{1}{2\pi}\,\frac{A}{R^2}\,2\pi R = \frac{A}{R} \to 0$$

as $R \to \infty$. Hence the integral (7) vanishes, so that

$$\sum_{j=1}^{m} \operatorname*{Res}_{z=z_j} \frac{P(z)}{Q(z)} + \sum_{j=1}^{r} \operatorname*{Res}_{z=z_j'} \frac{P(z)}{Q(z)} = \frac{1}{2\pi i} \oint_{|z|=R} \frac{P(z)}{Q(z)}\,dz = 0,$$

as asserted.


11.42. Fourier integrals 


a. By a Fourier integral we mean one of the frequently encountered integrals

$$\int_{-\infty}^{\infty} f(x)\,e^{i\sigma x}\,dx, \tag{8}$$

$$\int_{-\infty}^{\infty} f(x) \cos \sigma x\,dx, \tag{9}$$

$$\int_{-\infty}^{\infty} f(x) \sin \sigma x\,dx, \tag{10}$$

involving a real parameter $\sigma$. If the condition

$$\int_{-\infty}^{\infty} |f(x)|\,dx < \infty$$

is satisfied, then all three integrals (8)-(10) are absolutely convergent. If the
function $f(x)$ is real and approaches zero monotonically as $|x| \to \infty$, then the
integrals (9) and (10) converge for $\sigma \ne 0$ (in general nonabsolutely), by Example
11.22a. Suppose (9) and (10) converge. Then (10) vanishes if $f(x)$ is an even
function, i.e., if $f(-x) = f(x)$, while (9) vanishes if $f(x)$ is an odd function, i.e., if
$f(-x) = -f(x)$. Moreover, obviously

$$\int_{-\infty}^{\infty} f(x)\,e^{i\sigma x}\,dx = \int_{-\infty}^{\infty} f(x) \cos \sigma x\,dx + i \int_{-\infty}^{\infty} f(x) \sin \sigma x\,dx,$$

and hence the integrals (9) and (10) are the real and imaginary parts of (8) if $f(x)$
is real.


b. We now show how contour integration can be used to calculate Fourier
integrals of rational functions. Let

$$f(x) = \frac{P(x)}{Q(x)}$$

be a rational function, i.e., a quotient of polynomials, and suppose the
denominator $Q(x)$ has no real zeros and is of degree at least 1 greater than the
degree of the numerator $P(x)$. Then the integrals (8)-(10) converge, by Example
11.23c. As before, let $z_1, \ldots, z_m$ be the zeros of $Q(z)$ in the upper half-plane, and let
$L_R^+$ be the contour consisting of the interval $[-R, R]$ of the real axis and the
semicircle (2). Then, by formula (4), p. 407,

$$\oint_{L_R^+} \frac{P(z)}{Q(z)}\,e^{i\sigma z}\,dz = \int_{-R}^{R} \frac{P(x)}{Q(x)}\,e^{i\sigma x}\,dx + \int_{C_R^+} \frac{P(z)}{Q(z)}\,e^{i\sigma z}\,dz = 2\pi i \sum_{j=1}^{m} \operatorname*{Res}_{z=z_j} \frac{P(z)\,e^{i\sigma z}}{Q(z)}, \tag{11}$$

provided $R$ is large enough so that $L_R^+$ surrounds the points $z_1, \ldots, z_m$. If $\sigma > 0$,
then

$$\lim_{R \to \infty} \int_{C_R^+} \frac{P(z)}{Q(z)}\,e^{i\sigma z}\,dz = 0. \tag{12}$$

This follows by the same argument as in Sec. 11.41a if the degree of $Q(z)$
exceeds that of $P(z)$ by at least 2, since

$$|e^{i\sigma z}| = |e^{i\sigma (x + iy)}| = e^{-\sigma y} \le 1$$

if $y \ge 0$, $\sigma \ge 0$. However, the argument given there does not work if the degree of
$Q(z)$ exceeds that of $P(z)$ by only 1, and we must then resort to a new argument,
based on the following two propositions:

c. LEMMA. If $\sigma > 0$, then

$$\int_0^{\pi} e^{-R\sigma \sin \theta}\,d\theta < \frac{A}{R\sigma}, \tag{13}$$

where $A$ is a constant.


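Proof. Since $\sin (\pi - \theta) = \sin \theta$, we need only consider the integral

$$\int_0^{\pi/2} e^{-R\sigma \sin \theta}\,d\theta,$$

equal to half the integral in (13). But

$$\begin{aligned}
\int_0^{\pi/2} e^{-R\sigma \sin \theta}\,d\theta
&= \int_0^{\pi/6} e^{-R\sigma \sin \theta}\,d\theta + \int_{\pi/6}^{\pi/2} e^{-R\sigma \sin \theta}\,d\theta \\
&\le \int_0^{\pi/6} \frac{\cos \theta}{\cos (\pi/6)}\,e^{-R\sigma \sin \theta}\,d\theta + \int_{\pi/6}^{\pi/2} e^{-R\sigma \sin \theta}\,d\theta \\
&\le -\frac{1}{R\sigma \cos (\pi/6)}\,e^{-R\sigma \sin \theta}\,\Big|_0^{\pi/6} + \frac{\pi}{3}\,e^{-R\sigma/2} \\
&\le \frac{1}{R\sigma \cos (\pi/6)} + \frac{\pi}{3}\,e^{-R\sigma/2} \\
&= \frac{1}{R\sigma} \left[ \frac{1}{\cos (\pi/6)} + \frac{\pi}{3}\,R\sigma\,e^{-R\sigma/2} \right] < \frac{A}{R\sigma},
\end{aligned}$$

since the function $xe^{-x}$ is bounded for $x > 0$. ∎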
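
The estimate of the lemma can be probed numerically. The following sketch assumes the explicit constant $A = \pi$, which is not taken from the text but follows from the standard inequality $\sin \theta \ge 2\theta/\pi$ on $[0, \pi/2]$; the function name `jordan_integral` and the step count are illustrative.

```python
import math

def jordan_integral(R_sigma, steps=50_000):
    """Midpoint-rule value of the integral of exp(-R*sigma*sin(theta)) over [0, pi]."""
    h = math.pi / steps
    return h * sum(math.exp(-R_sigma * math.sin((k + 0.5) * h))
                   for k in range(steps))

for R_sigma in (1.0, 10.0, 100.0, 1000.0):
    value = jordan_integral(R_sigma)
    # The product value * R_sigma stays bounded, as the lemma asserts.
    print(R_sigma, value, value * R_sigma)
```

The last column remains below $\pi$ for every product $R\sigma$ tried, so the integral indeed decays like $1/(R\sigma)$.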


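d. COROLLARY (Jordan’s lemma). Given a function $f(z)$ defined for $\operatorname{Im} z \ge 0$,
$|z| \ge R_0$, suppose

$$\lim_{R \to \infty} \sup_{|z|=R} |f(z)| = 0. \tag{14}$$

Then

$$\lim_{R \to \infty} \int_{C_R^+} f(z)\,e^{i\sigma z}\,dz = 0$$

if $\sigma > 0$, where $C_R^+$ is the path (2).

Proof. It follows from (13) and (14) that

$$\begin{aligned}
\left| \int_{C_R^+} f(z)\,e^{i\sigma z}\,dz \right|
&= \left| \int_0^{\pi} f(Re^{i\theta})\,e^{i\sigma R \cos \theta}\,e^{-\sigma R \sin \theta}\,iRe^{i\theta}\,d\theta \right| \\
&\le R \sup_{z \in C_R^+} |f(z)| \int_0^{\pi} e^{-\sigma R \sin \theta}\,d\theta \\
&\le \frac{A}{\sigma} \sup_{z \in C_R^+} |f(z)| \to 0
\end{aligned}$$

as $R \to \infty$. ∎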


e. We can now complete the proof of formula (12) in the case where the degree
of $Q(z)$ exceeds that of $P(z)$ by only 1. Here

$$f(z) = \frac{P(z)}{Q(z)},$$

and hence the inequality

$$|f(z)| \le \frac{A}{R}$$

holds on $C_R^+$ for some constant $A$ and all sufficiently large $R$. But then (14) holds,
and hence so does (12). Thus, finally, to get the Fourier integral of $f(x)$ for $\sigma > 0$,
we need only take the limit as $R \to \infty$ in (11), obtaining

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,e^{i\sigma x}\,dx = 2\pi i \sum_{j=1}^{m} \operatorname*{Res}_{z=z_j} \frac{P(z)\,e^{i\sigma z}}{Q(z)}, \tag{15}$$

since clearly

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,e^{i\sigma x}\,dx = \lim_{R \to \infty} \int_{-R}^{R} \frac{P(x)}{Q(x)}\,e^{i\sigma x}\,dx.$$


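f. If $\sigma < 0$, the above argument breaks down, since $e^{i\sigma z}$ becomes arbitrarily large
on the path $C_R^+$. In this case, we carry out an analogous construction in the lower
half-plane rather than in the upper half-plane. Thus let $C_R^-$ and $L_R^-$ be the same as
in Sec. 11.41d, and let $R$ be large enough so that all the zeros $z_1', \ldots, z_r'$ in the
lower half-plane lie inside $L_R^-$. Then

$$\oint_{L_R^-} \frac{P(z)}{Q(z)}\,e^{i\sigma z}\,dz = \int_{R}^{-R} \frac{P(x)}{Q(x)}\,e^{i\sigma x}\,dx + \int_{C_R^-} \frac{P(z)}{Q(z)}\,e^{i\sigma z}\,dz = 2\pi i \sum_{j=1}^{r} \operatorname*{Res}_{z=z_j'} \frac{P(z)\,e^{i\sigma z}}{Q(z)},$$

where the integral along $C_R^-$ satisfies the estimate

$$\left| \int_{C_R^-} \frac{P(z)}{Q(z)}\,e^{i\sigma z}\,dz \right| = \left| \int_{\pi}^{2\pi} \frac{P(z)}{Q(z)}\,e^{i\sigma R \cos \theta}\,e^{-\sigma R \sin \theta}\,iRe^{i\theta}\,d\theta \right| \le R \sup_{z \in C_R^-} \left| \frac{P(z)}{Q(z)} \right| \int_{\pi}^{2\pi} e^{-R\sigma \sin \theta}\,d\theta,$$

in which the last integral equals $\int_0^{\pi} e^{-R|\sigma| \sin \theta}\,d\theta$, and hence approaches 0 as
$R \to \infty$, by Lemma 11.42c. It follows that

$$\int_{-\infty}^{\infty} \frac{P(x)}{Q(x)}\,e^{i\sigma x}\,dx = -2\pi i \sum_{j=1}^{r} \operatorname*{Res}_{z=z_j'} \frac{P(z)\,e^{i\sigma z}}{Q(z)}, \tag{15'}$$

which gives the value of the Fourier integral of $P(x)/Q(x)$ for $\sigma < 0$. Note that the
left-hand side of (15) or (15′) fails to exist if $\sigma = 0$ and if the degree of $Q(x)$
exceeds that of $P(x)$ by only 1.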


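g. Example. Using (15) and (15′), we find that

$$\int_{-\infty}^{\infty} \frac{x e^{i\sigma x}}{x^2 + 1}\,dx = \begin{cases} 2\pi i \operatorname*{Res}\limits_{z=i} \dfrac{z e^{i\sigma z}}{z^2 + 1} = i\pi e^{-\sigma} & \text{if } \sigma > 0, \\[2ex] -2\pi i \operatorname*{Res}\limits_{z=-i} \dfrac{z e^{i\sigma z}}{z^2 + 1} = -i\pi e^{\sigma} & \text{if } \sigma < 0. \end{cases}$$

The power of the method of contour integration is particularly apparent in this
case, since the corresponding indefinite integral

$$\int \frac{x e^{i\sigma x}}{x^2 + 1}\,dx$$

cannot be expressed in terms of elementary functions.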
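
The residue values quoted in this example can be confirmed by integrating numerically around a small circle enclosing each pole, since such a contour integral equals $2\pi i$ times the enclosed residue. This is a sketch only; the helper names `circle_integral` and `integrand`, the radius 0.5, and the sample value $\sigma = 1.5$ are all arbitrary choices made for illustration.

```python
import cmath
import math

def circle_integral(g, center, radius=0.5, steps=2000):
    """Counterclockwise integral of g over |z - center| = radius (trapezoid rule)."""
    total = 0j
    for k in range(steps):
        w = cmath.exp(2j * math.pi * k / steps)
        # dz = i * radius * w * d(theta) on the circle z = center + radius * w
        total += g(center + radius * w) * (1j * radius * w)
    return total * (2 * math.pi / steps)

def integrand(sigma):
    return lambda z: z * cmath.exp(1j * sigma * z) / (z * z + 1)

sigma = 1.5
upper = circle_integral(integrand(sigma), 1j)       # 2*pi*i * Res at z = i
lower = -circle_integral(integrand(-sigma), -1j)    # -2*pi*i * Res at z = -i
print(upper, 1j * math.pi * math.exp(-sigma))
print(lower, -1j * math.pi * math.exp(-sigma))
```

The first row corresponds to the case $\sigma > 0$ and the second to the parameter value $-\sigma < 0$; in each row the two printed numbers should coincide up to rounding.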


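h. The Fourier integral (9) or (10) can sometimes exist even if $f(x)$ has singular
points on the $x$-axis, provided the behavior of $f(x)$ at these points is compensated
by appropriate zeros of the functions $\cos \sigma x$ or $\sin \sigma x$. For example, consider the
integral

$$\int_0^{\infty} \frac{\sin \sigma x}{x}\,dx \tag{16}$$

of importance in mathematical physics.† To evaluate (16), let $\Lambda_R$ be the contour
shown in Figure 79, obtained by consecutively traversing the interval $[-R, -\varepsilon]$ of
the real axis, a small semicircle

$$C_{\varepsilon} = \{z : z = \varepsilon e^{i\theta},\ 0 \le \theta \le \pi\}$$

in the clockwise direction, the interval $[\varepsilon, R]$ of the real axis, and finally the
same large semicircle $C_R^+$ as above, given by (2).

Figure 79

Since the function $e^{i\sigma z}/z$ has no singular points inside or on $\Lambda_R$, we have

$$\oint_{\Lambda_R} \frac{e^{i\sigma z}}{z}\,dz = \int_{-R}^{-\varepsilon} \frac{e^{i\sigma x}}{x}\,dx + \int_{C_{\varepsilon}} \frac{e^{i\sigma z}}{z}\,dz + \int_{\varepsilon}^{R} \frac{e^{i\sigma x}}{x}\,dx + \int_{C_R^+} \frac{e^{i\sigma z}}{z}\,dz = 0 \tag{17}$$

(why?). Moreover

$$\lim_{R \to \infty} \int_{C_R^+} \frac{e^{i\sigma z}}{z}\,dz = 0$$

if $\sigma > 0$, by Jordan’s lemma, and

$$\int_{-R}^{-\varepsilon} \frac{e^{i\sigma x}}{x}\,dx + \int_{\varepsilon}^{R} \frac{e^{i\sigma x}}{x}\,dx = 2i \int_{\varepsilon}^{R} \frac{\sin \sigma x}{x}\,dx,$$

since the integrals of the odd function

$$\frac{\cos \sigma x}{x}$$

over the intervals $[-R, -\varepsilon]$ and $[\varepsilon, R]$ cancel each other out. Furthermore

$$\int_{C_{\varepsilon}} \frac{e^{i\sigma z}}{z}\,dz = \int_{C_{\varepsilon}} \frac{dz}{z} + \int_{C_{\varepsilon}} \frac{e^{i\sigma z} - 1}{z}\,dz = \int_{\pi}^{0} i\,d\theta + \int_{C_{\varepsilon}} \frac{e^{i\sigma z} - 1}{z}\,dz = -i\pi + \int_{C_{\varepsilon}} \frac{e^{i\sigma z} - 1}{z}\,dz,$$

where

$$\left| \int_{C_{\varepsilon}} \frac{e^{i\sigma z} - 1}{z}\,dz \right| \le \max_{z \in C_{\varepsilon}} \left| \frac{e^{i\sigma z} - 1}{z} \right| \pi\varepsilon \le A\varepsilon$$

for some constant $A$, since the function

$$\frac{e^{i\sigma z} - 1}{z}$$

is continuous at $z = 0$ and hence bounded near $z = 0$. Therefore, letting $\varepsilon \to 0$,
$R \to \infty$ in (17), we get

$$\int_0^{\infty} \frac{\sin \sigma x}{x}\,dx = \frac{\pi}{2}.$$

If $\sigma < 0$, then

$$\sin \sigma x = -\sin (-\sigma) x,$$

where $-\sigma > 0$, and hence

$$\int_0^{\infty} \frac{\sin \sigma x}{x}\,dx = -\int_0^{\infty} \frac{\sin (-\sigma) x}{x}\,dx = -\frac{\pi}{2}.$$

Thus, finally,

$$\int_0^{\infty} \frac{\sin \sigma x}{x}\,dx = \begin{cases} \pi/2 & \text{if } \sigma > 0, \\ -\pi/2 & \text{if } \sigma < 0, \end{cases}$$

as found by Euler in 1781.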
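
A direct numerical look at the truncated integrals shows the two limiting values $\pm\pi/2$. The sketch below uses Simpson's rule on $[0, X]$ with the integrand extended by its limit $\sigma$ at $x = 0$; the function name `sinc_integral`, the cutoff $X = 400$ and the step count are assumptions made for illustration. Since the convergence is only conditional, the truncated values approach the limit slowly, with an error of order $1/(|\sigma| X)$.

```python
import math

def sinc_integral(sigma, X, steps=40_000):
    """Simpson's rule for the integral of sin(sigma*x)/x over [0, X]."""
    def f(x):
        # The integrand has the finite limit sigma at x = 0.
        return sigma if x == 0.0 else math.sin(sigma * x) / x
    h = X / steps          # steps must be even for Simpson's rule
    total = f(0.0) + f(X)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3

for sigma in (2.0, 1.0, -1.0):
    print(sigma, sinc_integral(sigma, 400.0), math.copysign(math.pi / 2, sigma))
```

Changing the magnitude of $\sigma$ leaves the limit unchanged, while changing its sign flips the sign of the result.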


11.5. Parameter-Dependent Improper Integrals 


11.51. The Fourier integral is an example of an improper integral containing a
parameter ($\sigma$ in this case). More generally, consider an arbitrary improper
integral (of the first kind, say)

$$\Phi(t) = \int_a^{\infty} f(x, t)\,dx \tag{1}$$

containing a real parameter $t$, where (1) is assumed to converge for all $t$ in some
interval $\alpha \le t \le \beta$. We now ask the following questions about $\Phi(t)$, regarded as a
function of $t$, thereby generalizing the considerations of Sec. 9.11 to the case of
improper integrals:

(a) Under what conditions is $\Phi(t)$ continuous (cf. Theorem 9.111)?

(b) When does the formula

$$\int_{\alpha}^{\beta} \left\{ \int_a^{\infty} f(x, t)\,dx \right\} dt = \int_a^{\infty} \left\{ \int_{\alpha}^{\beta} f(x, t)\,dt \right\} dx \tag{2}$$

hold (cf. Theorem 9.112)?

(c) When does the formula

$$\Phi'(t) = \int_a^{\infty} f_t(x, t)\,dx \tag{3}$$

hold?†

To illustrate the import of these questions, consider the integral

$$\Phi(t) = \int_0^{\infty} \frac{\sin xt}{x}\,dx,$$

which converges for all real $t$. As shown in Sec. 11.42h,

$$\Phi(t) = \begin{cases} \pi/2 & \text{if } t > 0, \\ -\pi/2 & \text{if } t < 0, \end{cases}$$

while obviously $\Phi(0) = 0$. Thus, despite the continuity of the integrand

$$f(x, t) = \frac{\sin xt}{x} \tag{4}$$

in both variables $x$ and $t$, the integral $\Phi(t)$ is discontinuous at $t = 0$. Hence
continuity of the function $f(x, t)$ is certainly not enough to guarantee continuity
of the function $\Phi(t)$. Moreover, differentiating (4), we get

$$f_t(x, t) = \cos xt,$$

so that (3) fails to hold, since the integral on the right is divergent. Admittedly,
formula (2) holds for the function (4) if $\alpha > 0$ (see Example 11.59b), but this is
due to properties of (4) other than its mere continuity. In fact, Problem 6 gives
an example where (2) fails to hold although the integrand satisfies the continuity
requirements of Theorem 11.54.

Thus it is clear that extra conditions must be imposed on the integral (1) if
$\Phi(t)$ is to be continuous and if formulas (2) and (3) are to hold. As we will see
below, such an extra condition is afforded by the requirement that the
convergence of the improper integral be uniform. Although we will develop the
theory only for improper integrals of the first kind, everything can easily be
carried over to the case of improper integrals of the second and third kinds (do
this as an exercise).


11.52. Definition. Suppose the “parameter-dependent” integral (1) converges for
all $t$ in some set $M$. Then (1) is said to converge uniformly on $M$, if, given any
$\varepsilon > 0$, there exists a number $X_0 > a$ such that

$$\left| \Phi(t) - \int_a^X f(x, t)\,dx \right| = \left| \int_X^{\infty} f(x, t)\,dx \right| < \varepsilon$$

for all $X \ge X_0$ and all $t \in M$.


Given any numerical sequence $X_n \to \infty$ ($X_n > a$), consider the sequence of
functions

$$\Phi_n(t) = \int_a^{X_n} f(x, t)\,dx \qquad (n = 1, 2, \ldots), \tag{5}$$

and suppose the integral (1) converges uniformly on $M$. Then clearly the
sequence (5) converges uniformly on $M$ to its limit, given by (1).


11.53. THEOREM. Given a metric space $M$, suppose $f(x, t)$ is uniformly continuous
on every product space $[a, X] \times M$ ($a < X < \infty$), and suppose the integral (1)
converges uniformly on $M$. Then the function $\Phi(t)$ defined by (1) is continuous on
$M$.


Proof. By Theorem 9.111, each function $\Phi_n(t)$ defined by (5) is (uniformly)
continuous on $M$. Hence $\Phi(t)$ is also continuous on $M$, being the limit of a
uniformly convergent sequence of continuous functions (see Corollary 5.95b). ∎


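11.54. THEOREM. If $f(x, t)$ is continuous on every rectangle

$$a \le x \le X, \quad \alpha \le t \le \beta \qquad (a < X < \infty),$$

and if the integral (1) converges uniformly on the interval $\alpha \le t \le \beta$, then (2)
holds.


Proof. Since the sequence of continuous functions $\Phi_n(t)$ defined by (5)
converges uniformly to the function $\Phi(t)$ defined by (1), it follows from
Theorem 9.102 that

$$\lim_{n \to \infty} \int_{\alpha}^{\beta} \Phi_n(t)\,dt = \int_{\alpha}^{\beta} \Phi(t)\,dt. \tag{6}$$

On the other hand, by Theorem 9.112,

$$\int_{\alpha}^{\beta} \Phi_n(t)\,dt = \int_{\alpha}^{\beta} \left\{ \int_a^{X_n} f(x, t)\,dx \right\} dt = \int_a^{X_n} \left\{ \int_{\alpha}^{\beta} f(x, t)\,dt \right\} dx.$$

Therefore we can write (6) in the form

$$\lim_{n \to \infty} \int_a^{X_n} \left\{ \int_{\alpha}^{\beta} f(x, t)\,dt \right\} dx = \int_{\alpha}^{\beta} \Phi(t)\,dt. \tag{7}$$

But, by Theorem 11.12, the existence of the limit on the left for arbitrary $X_n \to \infty$
guarantees the existence of the improper integral

$$\int_a^{\infty} \left\{ \int_{\alpha}^{\beta} f(x, t)\,dt \right\} dx, \tag{8}$$

where, of course, the limit equals (8). Hence, recalling the definition of $\Phi(t)$, we
see that (7) is equivalent to (2). ∎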


11.55. a. THEOREM. Let $f(x,t)$ and its partial derivative $f_t(x,t)$† be continuous on every rectangle $a \le x \le X$, $\alpha \le t \le \beta$ ($a < X < \infty$), and suppose the integral

$$\int_a^\infty f(x,\alpha)\,dx$$

converges while the integral

$$\int_a^\infty f_t(x,t)\,dx$$

converges uniformly on the interval $\alpha \le t \le \beta$. Then the function $\Phi(t)$ defined by (1) exists and is differentiable on $\alpha \le t \le \beta$, with derivative (3).

Proof. Consider the sequence of functions $\Phi_n(t)$ defined by (5). It follows from Theorem 9.114 that

$$\Phi_n'(t) = \int_a^{X_n} f_t(x,t)\,dx \qquad (n = 1, 2, \ldots). \tag{9}$$

By hypothesis, the sequence $\Phi_n(t)$ converges at the point $t = \alpha$, while the sequence $\Phi_n'(t)$ converges uniformly on the interval $\alpha \le t \le \beta$. Hence, by Theorem 9.106, the sequence $\Phi_n(t)$ converges uniformly on $\alpha \le t \le \beta$, and its limit function $\Phi(t)$ is differentiable on $\alpha \le t \le \beta$, with derivative

$$\Phi'(t) = \lim_{n\to\infty}\Phi_n'(t).$$

This proves the existence and differentiability of the integral

$$\Phi(t) = \lim_{n\to\infty}\Phi_n(t) = \int_a^\infty f(x,t)\,dx$$

and the validity of (3).


b. The above theorem takes a somewhat different form for analytic functions. Suppose the function $f(x,t)$ is defined for all $x \in [a,\infty)$ and all $t \in G$, where $G$ is some domain in the plane of the complex variable $t = \sigma + i\tau$, and suppose the integral

$$\int_a^\infty f(x,t)\,dx \tag{10}$$

converges for all $t \in G$. Then (10) is said to converge uniformly inside $G$ (cf. Sec. 10.36a) if, given any compact set $Q \subset G$ and any $\varepsilon > 0$, there exists a number $X_0 = X_0(Q,\varepsilon) > a$ such that

$$\left|\int_X^\infty f(x,t)\,dx\right| < \varepsilon$$

for all $X \ge X_0$ and all $t \in Q$ (cf. Sec. 11.52).

THEOREM. Let $f(x,t)$ and its partial derivative $f_t(x,t)$ be continuous on every “cylinder”

$$a \le x \le X, \quad |t - t_0| \le r \qquad (a < X < \infty),$$

where the disk $|t - t_0| \le r$ is contained in a domain $G$, and suppose $f(x,t)$ is analytic in $t$ for every $x \in [a,\infty)$ while the integral (10) converges uniformly inside $G$. Then the function $\Phi(t)$ defined by (1) is analytic on $G$, with derivative (3).

Proof. By Theorem 10.27, every function (5) is analytic on $G$, with derivative (9). Moreover, $\Phi_n(t)$ converges uniformly to $\Phi(t)$ inside $G$, and hence $\Phi(t)$ is analytic on $G$, by Weierstrass’ theorem (Theorem 10.36b). But $\Phi_n'(t)$ converges uniformly to $\Phi'(t)$ inside $G$, by the same theorem, and hence

$$\Phi'(t) = \lim_{n\to\infty}\int_a^{X_n} f_t(x,t)\,dx = \int_a^\infty f_t(x,t)\,dx. \qquad \blacksquare$$

11.56. Next we prove a test for uniform convergence of improper integrals similar to the Cauchy convergence criterion (see Theorem 11.12).

THEOREM. The improper integral

$$\int_a^\infty f(x,t)\,dx \qquad (t \in M) \tag{11}$$

converges uniformly on the set $M$ if and only if the following condition is satisfied: Given any $\varepsilon > 0$, there exists a number $X_0 > a$ such that

$$\left|\int_{X'}^{X''} f(x,t)\,dx\right| < \varepsilon \tag{12}$$

for all $X', X'' \ge X_0$ and all $t \in M$.

Proof. Suppose (11) is uniformly convergent on $M$. Then, given any $\varepsilon > 0$, there is an $X_0 > a$ such that

$$\left|\int_{X'}^\infty f(x,t)\,dx\right| < \frac{\varepsilon}{2}$$

for all $X' \ge X_0$ and all $t \in M$. Choosing any other $X'' \ge X_0$, we also have

$$\left|\int_{X''}^\infty f(x,t)\,dx\right| < \frac{\varepsilon}{2}$$

for all $t \in M$. Therefore (12) holds for all $X', X'' \ge X_0$ and all $t \in M$, since obviously

$$\left|\int_{X'}^{X''} f(x,t)\,dx\right| = \left|\int_{X'}^\infty f(x,t)\,dx - \int_{X''}^\infty f(x,t)\,dx\right| \le \left|\int_{X'}^\infty f(x,t)\,dx\right| + \left|\int_{X''}^\infty f(x,t)\,dx\right|.$$

Conversely, suppose (12) holds for all $X', X'' \ge X_0$ and all $t \in M$. Then the integral (11) converges (for all $t \in M$) by Theorem 11.12. Taking the limit as $X'' \to \infty$ in (12), we get

$$\left|\int_{X'}^\infty f(x,t)\,dx\right| \le \varepsilon$$

for all $X' \ge X_0$ and all $t \in M$, so that (11) is uniformly convergent on $M$. $\blacksquare$


11.57. a. Let $g(x)$ be a nonnegative function such that the integral

$$\int_a^\infty g(x)\,dx$$

converges. Then $g(x)$ is called an integrable majorant of every (possibly complex) function $f(x)$ such that $|f(x)| \le g(x)$.

THEOREM. Suppose $g(x)$ is an integrable majorant of the function $f(x,t)$ for all $t \in M$. Then the integral

$$\int_a^\infty f(x,t)\,dx$$

is uniformly convergent on $M$.

Proof. An immediate consequence of Theorem 11.56 and the estimate

$$\left|\int_{X'}^{X''} f(x,t)\,dx\right| \le \int_{X'}^{X''} |f(x,t)|\,dx \le \int_{X'}^{X''} g(x)\,dx$$

(see also Theorem 11.12).


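b. Example. The improper integral

$$\int_0^\infty \frac{\cos\alpha x - \cos\beta x}{x^2}\,dx \qquad (0 \le \alpha \le 1,\ 0 \le \beta \le 1) \tag{13}$$

is uniformly convergent on the interval $0 \le \alpha \le 1$ for any fixed $\beta$. To see this, we first prove the estimate

$$\left|\frac{\cos\alpha x - \cos\beta x}{x^2}\right| \le \begin{cases} \dfrac{2}{x^2} & \text{if } x \ge 1, \\[1ex] 2 & \text{if } 0 < x \le 1, \end{cases}$$

valid for $\alpha$ and $\beta$ in the unit interval $[0, 1]$. The first inequality is obvious. To prove the second, we use the formula

$$\cos\gamma x = 1 - \frac{\gamma^2x^2}{2} + x^4 g(\gamma, x)$$

(see formula (7), p. 254), where

$$|g(\gamma,x)| = \left|\frac{\gamma^4}{4!} - \frac{\gamma^6x^2}{6!} + \cdots\right| \le \frac{1}{4!} + \frac{1}{6!} + \cdots < \frac{1}{2}$$

if $|\gamma| \le 1$, $|x| \le 1$, which in turn implies

$$\left|\frac{\cos\alpha x - \cos\beta x}{x^2}\right| = \left|\frac{\beta^2 - \alpha^2}{2} + x^2[g(\alpha,x) - g(\beta,x)]\right| \le \frac{1}{2} + \frac{1}{2} + \frac{1}{2} < 2$$

if $\alpha, \beta, x \in [0, 1]$. Thus the function

$$g(x) = \begin{cases} \dfrac{2}{x^2} & \text{if } x \ge 1, \\[1ex] 2 & \text{if } 0 < x \le 1 \end{cases}$$

is an integrable majorant for the function

$$\frac{\cos\alpha x - \cos\beta x}{x^2}$$

for all $\alpha, \beta \in [0, 1]$. Therefore the integral (13) is uniformly convergent on the interval $0 \le \alpha \le 1$ for any fixed $\beta$ (or, for that matter, on the square $0 \le \alpha \le 1$, $0 \le \beta \le 1$). Hence, by Theorem 11.53, (13) represents a continuous function of $\alpha$ on the interval $0 \le \alpha \le 1$. An explicit expression for this function will be given in Sec. 11.59b.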
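The majorant estimate of this example can be spot-checked on a finite grid. This is only a numerical sanity check, not a proof; the function names and the grid sizes below are hypothetical choices made for illustration.

```python
import math

def quotient(alpha, beta, x):
    """Integrand of (13): (cos(alpha*x) - cos(beta*x)) / x**2."""
    return (math.cos(alpha * x) - math.cos(beta * x)) / (x * x)

def majorant(x):
    """Majorant from the example: 2/x**2 for x >= 1, and 2 for 0 < x < 1."""
    return 2.0 / (x * x) if x >= 1.0 else 2.0

def worst_ratio(n_param=20, n_x=1000, x_step=0.05):
    """Largest value of |integrand| / majorant over a finite sample grid."""
    worst = 0.0
    for i in range(n_param + 1):
        for j in range(n_param + 1):
            alpha, beta = i / n_param, j / n_param
            for k in range(1, n_x + 1):
                x = k * x_step
                worst = max(worst, abs(quotient(alpha, beta, x)) / majorant(x))
    return worst

# A value not exceeding 1 is consistent with the estimate on the sampled grid.
print(worst_ratio())
```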


11.58. Convolutions 


a. By the convolution of two complex functions $f(t)$ and $g(t)$, defined for all real $t$, we mean the integral

$$h(t) = \int_{-\infty}^{\infty} f(x)g(t-x)\,dx \tag{14}$$

(an improper integral of the third kind). The function $h(t)$ does not always exist. Conditions for the existence of $h(t)$ are given by the following theorem, which at the same time establishes the properties of $h(t)$:

THEOREM. If $f(t)$ and $g(t)$ are bounded, continuous, and absolutely integrable on the real line $-\infty < t < \infty$, then $h(t)$ exists for all $t$. Moreover $h(t)$ is itself bounded, continuous, and absolutely integrable on the real line, and satisfies the relation

$$\int_{-\infty}^{\infty} h(t)\,dt = \int_{-\infty}^{\infty} f(t)\,dt \int_{-\infty}^{\infty} g(t)\,dt. \tag{15}$$

Proof. Let $|g(t)| \le C$. Then the integrand of (14) satisfies the inequality

$$|f(x)g(t-x)| \le C|f(x)|,$$

so that the integral (14) is convergent, in fact uniformly convergent on the whole line $-\infty < t < \infty$. Moreover, $h(t)$ is bounded, since

$$|h(t)| \le \int_{-\infty}^{\infty} |f(x)||g(t-x)|\,dx \le C\int_{-\infty}^{\infty} |f(x)|\,dx.$$

Next, with a view to proving the continuity of $h(t)$, we show that the integrand of (14) is continuous in both arguments on every finite rectangle

$$a \le x \le b, \quad \alpha \le t \le \beta. \tag{16}$$

Given any $\varepsilon > 0$, there exists a $\delta > 0$ such that $x', x'' \in [a, b]$, $|x' - x''| < \delta$ implies $|f(x') - f(x'')| < \varepsilon$, while $t', t'' \in [\alpha - b, \beta - a]$, $|t' - t''| < 2\delta$ implies $|g(t') - g(t'')| < \varepsilon$.† But $x', x'' \in [a, b]$, $|x' - x''| < \delta$, $t', t'' \in [\alpha, \beta]$, $|t' - t''| < \delta$ implies $t' - x', t'' - x'' \in [\alpha - b, \beta - a]$, $|t' - x' - (t'' - x'')| < 2\delta$, and hence

$$|f(x')g(t'-x') - f(x'')g(t''-x'')| \le |f(x') - f(x'')||g(t'-x')| + |f(x'')||g(t'-x') - g(t''-x'')| \le C\varepsilon + C\varepsilon = 2C\varepsilon$$

if $|f(x)| \le C$, $|g(x)| \le C$. Thus $f(x)g(t-x)$ is continuous in both $x$ and $t$ on every rectangle (16). It follows from Theorem 11.53 that $h(t)$ is a continuous function of $t$.

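We now verify the absolute integrability of $h(t)$. First we note that the functions $|f(t)|$ and $|g(t)|$ have the same properties as $f(t)$ and $g(t)$ themselves, so that the integral

$$\int_{-\infty}^{\infty} |f(x)||g(t-x)|\,dx$$

is also uniformly convergent on the whole line $-\infty < t < \infty$, with an integrand which is continuous in both $x$ and $t$ on every rectangle (16). Applying Theorem 11.54, we get

$$\int_\alpha^\beta |h(t)|\,dt \le \int_\alpha^\beta\left\{\int_{-\infty}^{\infty} |f(x)||g(t-x)|\,dx\right\}dt = \int_{-\infty}^{\infty} |f(x)|\left\{\int_\alpha^\beta |g(t-x)|\,dt\right\}dx$$

$$= \int_{-\infty}^{\infty} |f(x)|\left\{\int_{\alpha-x}^{\beta-x} |g(t)|\,dt\right\}dx \le \int_{-\infty}^{\infty} |f(x)|\,dx \int_{-\infty}^{\infty} |g(t)|\,dt,$$

from which the existence of the integral

$$\int_{-\infty}^{\infty} |h(t)|\,dt$$

follows at once.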
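Finally, to prove (15), we again use Theorem 11.54, this time obtaining

$$\int_{-\tau}^{\tau} h(t)\,dt = \int_{-\tau}^{\tau}\left\{\int_{-\infty}^{\infty} f(x)g(t-x)\,dx\right\}dt = \int_{-\infty}^{\infty} f(x)\left\{\int_{-\tau}^{\tau} g(t-x)\,dt\right\}dx$$

$$= \int_{-\infty}^{\infty} f(x)\left\{\int_{-\tau-x}^{\tau-x} g(t)\,dt\right\}dx = \int_{-\infty}^{\infty} f(x)\left\{\int_{-\infty}^{\infty} g(t)\,dt - \int_{|t+x|\ge\tau} g(t)\,dt\right\}dx$$

$$= \int_{-\infty}^{\infty} g(t)\,dt\int_{-\infty}^{\infty} f(x)\,dx - \int_{-\infty}^{\infty} f(x)\left\{\int_{|t+x|\ge\tau} g(t)\,dt\right\}dx. \tag{17}$$

Given any $\varepsilon > 0$, we first choose $p$ large enough so that

$$\int_{|x|\ge p} |f(x)|\,dx < \varepsilon$$

and then $\tau$ large enough so that

$$\int_{|t|\ge\tau-p} |g(t)|\,dt < \varepsilon.$$

We then have

$$\left|\int_{-\infty}^{\infty} f(x)\left\{\int_{|t+x|\ge\tau} g(t)\,dt\right\}dx\right| \le \int_{|x|\ge p} |f(x)|\left\{\int_{|t+x|\ge\tau} |g(t)|\,dt\right\}dx + \int_{|x|\le p} |f(x)|\left\{\int_{|t+x|\ge\tau} |g(t)|\,dt\right\}dx$$

$$\le \varepsilon\int_{-\infty}^{\infty} |g(t)|\,dt + \varepsilon\int_{-\infty}^{\infty} |f(x)|\,dx,$$

since $|x| \le p$ and $|t+x| \ge \tau$ together imply $|t| \ge \tau - p$. Hence the last integral in (17) approaches 0 as $\tau \to \infty$, and the proof of (15) is complete.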


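b. Denoting the convolution of the functions $f(t)$ and $g(t)$ by $f(t) * g(t)$, we can write (15) in the form

$$\int_{-\infty}^{\infty} f(t) * g(t)\,dt = \int_{-\infty}^{\infty} f(t)\,dt \int_{-\infty}^{\infty} g(t)\,dt. \tag{15'}$$

Making the substitution $t - x = \xi$, $x = t - \xi$ in the “proper” integral

$$h_\tau(t) = \int_{-\tau}^{\tau} f(x)g(t-x)\,dx,$$

we get

$$h_\tau(t) = \int_{t-\tau}^{t+\tau} f(t-\xi)g(\xi)\,d\xi.$$

Then, taking the limit as $\tau \to \infty$, we find that

$$f(t) * g(t) = \int_{-\infty}^{\infty} f(x)g(t-x)\,dx = \int_{-\infty}^{\infty} f(t-\xi)g(\xi)\,d\xi = g(t) * f(t),$$

i.e., the convolution of the functions $f(t)$ and $g(t)$ does not change if we reverse the order of the functions.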
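Relation (15) and the commutativity of the convolution can be checked numerically on a pair of sample functions. The sketch below assumes $f(t) = e^{-t^2}$ and $g(t) = e^{-|t|}$ (both bounded, continuous, and absolutely integrable) and replaces every integral by a Riemann sum on a truncated grid; the names `conv`, `integral`, `H`, `L`, and `GRID` are hypothetical.

```python
import math

def f(t):
    return math.exp(-t * t)      # integral over the real line: sqrt(pi)

def g(t):
    return math.exp(-abs(t))     # integral over the real line: 2

H, L = 0.05, 12.0                # grid step and truncation bound (assumed)
GRID = [-L + k * H for k in range(int(round(2 * L / H)) + 1)]

def conv(u, v, t):
    """Riemann-sum approximation of the convolution (14) at the point t."""
    return H * sum(u(x) * v(t - x) for x in GRID)

def integral(u):
    """Riemann-sum approximation of the integral of u over [-L, L]."""
    return H * sum(u(t) for t in GRID)

lhs = integral(lambda t: conv(f, g, t))   # integral of h = f * g
rhs = integral(f) * integral(g)           # product of the two integrals
print(lhs, rhs, 2 * math.sqrt(math.pi))
print(conv(f, g, 0.7), conv(g, f, 0.7))   # the two orders should agree
```

The truncation bound and step size are a trade-off between accuracy and running time; the double sum costs on the order of the square of the grid size.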


11.59. a. The following test for uniform convergence of improper integrals is the natural generalization of the Abel-Dirichlet test (Theorem 11.23a) and works in some cases where the “majorant test” (Theorem 11.57a) is not applicable:

THEOREM. Let $f(x)$ be piecewise smooth on every finite subinterval $[a, X] \subset [a, \infty)$, with derivative $f'(x)$, and suppose $f(x) \to 0$ as $x \to \infty$ while $f'(x)$ is absolutely integrable on $[a, \infty)$. Moreover, let $g(x,t)$ be piecewise smooth on every finite subinterval $[a, X] \subset [a, \infty)$ for every $t \in M$, and suppose

$$|G(x,t)| \le C \qquad (a \le x < \infty,\ t \in M),$$

where

$$G(x,t) = \int_a^x g(\xi,t)\,d\xi \qquad (a \le x < \infty)$$

and $C$ is independent of $x \in [a, \infty)$ and $t \in M$. Then the integral

$$\int_a^\infty f(x)g(x,t)\,dx$$

is uniformly convergent on $M$.

Proof. The exact analogue of the proof of Theorem 11.23a, with the use of the Cauchy convergence criterion for uniform convergence of improper integrals (Theorem 11.56).


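b. Example. The integral

$$\int_0^\infty \frac{\sin xt}{x}\,dx \qquad (t > 0)$$

(calculated in Sec. 11.42h) is uniformly convergent on any interval $0 < \alpha \le t \le \beta$. This follows from the Abel-Dirichlet test, since the function $f(x) = 1/x$ approaches 0 as $x \to \infty$ and has an (absolutely) integrable derivative

$$f'(x) = -\frac{1}{x^2},$$

while

$$\left|\int_a^x g(\xi,t)\,d\xi\right| = \left|\int_a^x \sin\xi t\,d\xi\right| \le \frac{2}{t} \le \frac{2}{\alpha}.$$

Using Theorem 11.54 to integrate with respect to $t$ from $\alpha$ to $\beta$, we get the integral considered in Example 11.57b:

$$\int_0^\infty \frac{\cos\alpha x - \cos\beta x}{x^2}\,dx = \int_\alpha^\beta\left\{\int_0^\infty \frac{\sin xt}{x}\,dx\right\}dt = \int_\alpha^\beta \frac{\pi}{2}\,dt = \frac{\pi}{2}(\beta - \alpha).$$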
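The closed-form value $\frac{\pi}{2}(\beta - \alpha)$ can be compared with a direct numerical approximation. The sketch below uses a midpoint rule on a truncated interval $[0, X]$; the function name, the truncation point, and the step size are assumptions chosen for illustration, and the neglected tail is bounded by $2/X$.

```python
import math

def truncated_integral(alpha, beta, X=400.0, h=0.01):
    """Midpoint-rule value of the integral of
    (cos(alpha*x) - cos(beta*x)) / x**2 over [0, X]."""
    n = int(round(X / h))
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h          # midpoints avoid the point x = 0
        total += (math.cos(alpha * x) - math.cos(beta * x)) / (x * x)
    return total * h

alpha, beta = 0.3, 1.0
# The two printed numbers should be close to each other.
print(truncated_integral(alpha, beta), math.pi / 2 * (beta - alpha))
```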


11.6. The Gamma and Beta Functions 


11.61. By the gamma function we mean the function defined by the integral

$$\Gamma(\tau) = \int_0^\infty t^{\tau-1}e^{-t}\,dt. \tag{1}$$

This improper integral of the third kind is the sum of an improper integral

$$\int_1^\infty t^{\tau-1}e^{-t}\,dt$$

of the first kind and an improper integral

$$\int_0^1 t^{\tau-1}e^{-t}\,dt$$

of the second kind, where the first integral converges for all real $\tau$ and the second converges for all $\tau > 0$. Therefore (1) defines $\Gamma(\tau)$ for all $\tau > 0$.

Suppose $0 < \alpha \le \tau \le \beta$. Then the integrand $t^{\tau-1}e^{-t}$ has the integrable majorant

$$g(t) = \begin{cases} t^{\alpha-1}e^{-t} & \text{if } 0 < t \le 1, \\ t^{\beta-1}e^{-t} & \text{if } t \ge 1. \end{cases}$$

It follows that the integral (1) converges uniformly on every interval $[\alpha, \beta]$, and hence, by Theorem 11.53, represents a continuous function of $\tau$ on $(0, \infty)$, since $t^{\tau-1}e^{-t}$ is clearly continuous on every rectangle $0 < a \le t \le T$, $0 < \alpha \le \tau \le \beta$ (why?). By the same token, the derivative $t^{\tau-1}e^{-t}\ln t$ of the integrand with respect to $\tau$ has the integrable majorant

$$g_1(t) = \begin{cases} t^{\alpha-1}e^{-t}|\ln t| & \text{if } 0 < t \le 1, \\ t^{\beta}e^{-t} & \text{if } t \ge 1, \end{cases}$$

so that the integral

$$\int_0^\infty t^{\tau-1}e^{-t}\ln t\,dt$$

also converges uniformly on every interval $[\alpha, \beta]$. Hence, by Theorem 11.55a, $\Gamma(\tau)$ has the derivative

$$\Gamma'(\tau) = \int_0^\infty t^{\tau-1}e^{-t}\ln t\,dt \qquad (\tau > 0),$$

since $t^{\tau-1}e^{-t}\ln t$ and its partial derivative with respect to $\tau$ are both continuous on every rectangle $0 < a \le t \le T$, $0 < \alpha \le \tau \le \beta$. Moreover, $\Gamma'(\tau)$ is continuous for all $\tau > 0$ for the same reasons as the function $\Gamma(\tau)$ itself. We can repeatedly differentiate $\Gamma(\tau)$, each time applying Theorem 11.55a. Thus $\Gamma(\tau)$ has derivatives of all orders.


11.62. Integrating by parts in the definition (1) of $\Gamma(\tau)$, we find that

$$\Gamma(\tau) = \int_0^\infty t^{\tau-1}e^{-t}\,dt = \left.\frac{t^\tau e^{-t}}{\tau}\right|_0^\infty + \frac{1}{\tau}\int_0^\infty t^\tau e^{-t}\,dt = \frac{1}{\tau}\int_0^\infty t^\tau e^{-t}\,dt,$$

so that

$$\int_0^\infty t^\tau e^{-t}\,dt = \tau\Gamma(\tau)$$

or

$$\Gamma(\tau+1) = \tau\Gamma(\tau). \tag{2}$$

Formula (2) is the basic functional equation for the gamma function. Applying (2) repeatedly, we get

$$\Gamma(\tau+n) = (\tau+n-1)(\tau+n-2)\cdots(\tau+1)\tau\Gamma(\tau).$$

Thus from a knowledge of the values of the function $\Gamma(\tau)$ on any interval of length 1, we can find its values on the rest of the half-line $\tau > 0$. Moreover, since

$$\Gamma(1) = \int_0^\infty e^{-t}\,dt = 1,$$

we see that

$$\Gamma(n+1) = n(n-1)\cdots 1 = n!.$$

It follows that the gamma function is a continuous extension to all positive real numbers of the function $n!$ ($n$ factorial) defined for all positive integers. Note that (2) implies

$$\lim_{\tau\to 0}\Gamma(\tau) = \lim_{\tau\to 0}\Gamma(\tau+1)\cdot\frac{1}{\tau} = \infty,$$

since $\Gamma(\tau)$ is continuous at $\tau = 1$.
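The functional equation (2) and the relation $\Gamma(n+1) = n!$ can be checked against an independent implementation. The sketch below relies on `math.gamma` and `math.factorial` from the Python standard library rather than on the integral (1); the helper name `recurrence_gap` is hypothetical.

```python
import math

def recurrence_gap(tau):
    """Relative difference between Gamma(tau + 1) and tau * Gamma(tau)."""
    lhs = math.gamma(tau + 1.0)
    rhs = tau * math.gamma(tau)
    return abs(lhs - rhs) / abs(lhs)

for tau in (0.25, 1.0, 2.5, 7.3):
    print(tau, recurrence_gap(tau))      # gaps at rounding-error level

for n in range(6):
    print(n, math.gamma(n + 1), math.factorial(n))

# Gamma(tau) grows without bound as tau -> 0, while tau * Gamma(tau) -> 1.
for tau in (1e-1, 1e-3, 1e-6):
    print(tau, math.gamma(tau), tau * math.gamma(tau))
```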
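11.63. The beta function and its relation to the gamma function.

By the beta function we mean the function of two parameters $p$ and $q$ defined by the integral

$$B(p,q) = \int_0^1 x^{p-1}(1-x)^{q-1}\,dx. \tag{3}$$

This integral exists for all positive values of $p$ and $q$, as an ordinary “proper” integral if $p, q \ge 1$ and as a convergent improper integral otherwise. The substitution

$$x = \frac{\theta}{1+\theta}$$

transforms (3) into the form

$$B(p,q) = \int_0^\infty \frac{\theta^{p-1}}{(1+\theta)^{p+q}}\,d\theta. \tag{4}$$

Next we show how the beta function can be expressed in terms of the gamma function. Making the substitution $t = \theta y$ in the expression (1) for the gamma function, we get

$$\frac{\Gamma(\tau)}{\theta^\tau} = \int_0^\infty y^{\tau-1}e^{-\theta y}\,dy,$$

which becomes

$$\frac{\Gamma(p+q)}{(1+\theta)^{p+q}} = \int_0^\infty y^{p+q-1}e^{-(1+\theta)y}\,dy \tag{5}$$

after replacing $\tau$ by $p + q$ and $\theta$ by $1 + \theta$. Then, multiplying both sides of (5) by $\theta^{p-1}$ and integrating with respect to $\theta$ from 0 to $n$, we find that

$$\Gamma(p+q)\int_0^n \frac{\theta^{p-1}}{(1+\theta)^{p+q}}\,d\theta = \int_0^n\left\{\int_0^\infty \theta^{p-1}y^{p+q-1}e^{-y}e^{-\theta y}\,dy\right\}d\theta, \tag{6}$$

where the left-hand side approaches $\Gamma(p+q)B(p,q)$ as $n \to \infty$, because of (4). As for the right-hand side of (6), the integrand has the integrable majorant $n^{p-1}y^{p+q-1}e^{-y}$. Hence, by Theorem 11.54, we can reverse the order of integration, obtaining

$$\int_0^n\left\{\int_0^\infty \theta^{p-1}y^{p+q-1}e^{-y}e^{-\theta y}\,dy\right\}d\theta = \int_0^\infty\left\{y^{p+q-1}e^{-y}\int_0^n \theta^{p-1}e^{-\theta y}\,d\theta\right\}dy$$

$$= \int_0^\infty\left\{y^{p+q-1}e^{-y}\frac{1}{y^p}\int_0^{ny} t^{p-1}e^{-t}\,dt\right\}dy = \int_0^\infty y^{q-1}e^{-y}F_n(y)\,dy,$$

where the function

$$F_n(y) = \int_0^{ny} t^{p-1}e^{-t}\,dt$$

approaches the function

$$F(y) = \begin{cases} 0 & \text{if } y = 0, \\ \Gamma(p) & \text{if } y > 0 \end{cases}$$

as $n \to \infty$. This convergence is nonuniform on the interval $0 \le y \le b$ (as can be seen from the discontinuity of the limit function), but the convergence is indeed uniform on every interval $0 < h \le y \le b$, since

$$0 \le \Gamma(p) - \int_0^{ny} t^{p-1}e^{-t}\,dt \le \Gamma(p) - \int_0^{nh} t^{p-1}e^{-t}\,dt.$$

Moreover, since $F_n(y)$ is an increasing nonnegative function converging to $\Gamma(p)$, the set of functions

$$y^{q-1}e^{-y}F_n(y) \qquad (n = 1, 2, \ldots)$$

has the integrable majorant $y^{q-1}e^{-y}\Gamma(p)$, and hence, by a suitable “sequence analogue” of Theorem 11.53 (give the details),

$$\lim_{n\to\infty}\int_h^\infty y^{q-1}e^{-y}F_n(y)\,dy = \Gamma(p)\int_h^\infty y^{q-1}e^{-y}\,dy.$$

On the other hand,

$$0 \le \int_0^h y^{q-1}e^{-y}F_n(y)\,dy \le \Gamma(p)\int_0^h y^{q-1}\,dy = \Gamma(p)\frac{h^q}{q} \tag{7}$$

for every $h > 0$. Given any $\varepsilon > 0$, first let $h > 0$ be such that

$$\Gamma(p)\frac{h^q}{q} < \frac{\varepsilon}{3}, \tag{8}$$

and then choose $N$ such that

$$0 \le \Gamma(p)\int_h^\infty y^{q-1}e^{-y}\,dy - \int_h^\infty y^{q-1}e^{-y}F_n(y)\,dy < \frac{\varepsilon}{3} \tag{9}$$

for all $n > N$, at the same time noting that

$$0 \le \Gamma(p)\Gamma(q) - \Gamma(p)\int_h^\infty y^{q-1}e^{-y}\,dy = \Gamma(p)\int_0^h y^{q-1}e^{-y}\,dy \le \Gamma(p)\frac{h^q}{q} < \frac{\varepsilon}{3} \tag{10}$$

for our choice of $h$. It follows from (7)-(10) that

$$0 \le \Gamma(p)\Gamma(q) - \int_0^\infty y^{q-1}e^{-y}F_n(y)\,dy < \varepsilon,$$

and hence that

$$\lim_{n\to\infty}\int_0^\infty y^{q-1}e^{-y}F_n(y)\,dy = \Gamma(p)\Gamma(q).$$

Thus, finally, taking the limit as $n \to \infty$ in (6), we get the desired expression for the beta function in terms of the gamma function:

$$B(p,q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}. \tag{11}$$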
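Formula (11) lends itself to a quick numerical comparison. The sketch below approximates the defining integral (3) by a midpoint rule for parameters $p, q \ge 1$ (so that the integral is proper) and compares it with the gamma-function expression computed by the standard-library `math.gamma`; the function names and the number of subintervals are hypothetical choices.

```python
import math

def beta_midpoint(p, q, n=20000):
    """Midpoint-rule approximation of the integral (3) over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x ** (p - 1) * (1.0 - x) ** (q - 1)
    return total * h

def beta_via_gamma(p, q):
    """Right-hand side of (11)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

for p, q in ((3.0, 2.0), (2.5, 1.5), (1.0, 1.0)):
    print(p, q, beta_midpoint(p, q), beta_via_gamma(p, q))
```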


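11.64. Many trigonometric integrals can in turn be expressed in terms of the beta function. For example, making the substitution $x = \sin^2\theta$ in the integral

$$I = \int_0^{\pi/2} \sin^{p-1}\theta\cos^{q-1}\theta\,d\theta,$$

we get

$$I = \int_0^1 x^{(p-1)/2}(1-x)^{(q-1)/2}\frac{dx}{2x^{1/2}(1-x)^{1/2}} = \frac{1}{2}\int_0^1 x^{(p/2)-1}(1-x)^{(q/2)-1}\,dx = \frac{1}{2}B\left(\frac{p}{2},\frac{q}{2}\right).$$
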
11.65. Setting $q = 1 - p$ in (11) and recalling (4), we get

$$\Gamma(p)\Gamma(1-p) = B(p,1-p) = \int_0^\infty \frac{\theta^{p-1}}{1+\theta}\,d\theta. \tag{12}$$

The integral on the right can easily be calculated by using contour integration. To this end, consider the analytic function

$$f(z) = \frac{z^{p-1}}{1+z}, \tag{13}$$

defined on the complex $z$-plane cut along the positive real axis. The function (13) is uniquely determined if we set

$$f(x+i0) = \frac{x^{p-1}}{1+x} \tag{14}$$

on the upper edge of the cut ($x^{p-1}$ being defined as in Sec. 5.53). Making one circuit around the origin in the counterclockwise direction, we get $z = x - i0 = xe^{2\pi i}$, and hence we set

$$f(x-i0) = \frac{x^{p-1}e^{2\pi i(p-1)}}{1+x} \tag{15}$$

on the lower edge of the cut.

Now let $L_R$ be the contour shown in Figure 80, consisting of the interval $[0, R]$, $R > 1$ of the real axis along the upper edge of the cut, the circle $C_R$ of radius $R$ centered at the origin (traversed in the counterclockwise direction) and the interval $[0, R]$ of the real axis along the lower edge of the cut (traversed from right to left). The only singular point of $f(z)$ inside $L_R$ is at the point $z = -1$. Therefore, by formula (4), p. 407,

$$\oint_{L_R} f(z)\,dz = 2\pi i\operatorname*{Res}_{z=-1} f(z) = 2\pi ie^{\pi i(p-1)}, \tag{16}$$

Figure 80

while on the other hand,

$$\oint_{L_R} f(z)\,dz = \int_0^R f(x+i0)\,dx + \int_{C_R} f(z)\,dz + \int_R^0 f(x-i0)\,dx. \tag{17}$$

On $C_R$ we have

$$|f(z)| \le AR^{p-2}$$

for some constant $A$ and all sufficiently large $R$, and hence

$$\left|\int_{C_R} f(z)\,dz\right| \le AR^{p-2}\cdot 2\pi R = 2\pi AR^{p-1} \to 0$$

as $R \to \infty$. Thus, taking the limit as $R \to \infty$ in (17) and using (14)-(16), we get

$$\left(1 - e^{2\pi i(p-1)}\right)\int_0^\infty \frac{x^{p-1}}{1+x}\,dx = 2\pi ie^{\pi i(p-1)},$$

and hence

$$\int_0^\infty \frac{x^{p-1}}{1+x}\,dx = \frac{2\pi ie^{\pi i(p-1)}}{1 - e^{2\pi i(p-1)}} = \frac{2\pi i}{e^{-\pi i(p-1)} - e^{\pi i(p-1)}} = -\frac{\pi}{\sin\pi(p-1)} = \frac{\pi}{\sin\pi p}. \tag{18}$$

Finally, comparing (12) and (18), we obtain the following “complementation formula” for the gamma function:

$$\Gamma(p)\Gamma(1-p) = B(p,1-p) = \frac{\pi}{\sin\pi p}. \tag{19}$$

In particular, setting $p = \frac{1}{2}$ in (19) gives

$$\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi},$$

which together with the basic functional equation (2) implies

$$\Gamma\left(\frac{3}{2}\right) = \frac{1}{2}\Gamma\left(\frac{1}{2}\right) = \frac{\sqrt{\pi}}{2}$$

and more generally,

$$\Gamma\left(\frac{2n+1}{2}\right) = \frac{(2n-1)(2n-3)\cdots 3\cdot 1}{2^n}\sqrt{\pi}.$$

The graph of the gamma function $\Gamma(x)$ for $0 < x < 5$ is shown in Figure 81.
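The complementation formula (19) and the half-integer values of the gamma function can be verified numerically. The sketch below uses the standard-library `math.gamma` as an independent reference; the helper names `reflection_gap` and `gamma_half_integer` are hypothetical.

```python
import math

def reflection_gap(p):
    """Absolute difference between the two sides of (19) for 0 < p < 1."""
    return abs(math.gamma(p) * math.gamma(1.0 - p) - math.pi / math.sin(math.pi * p))

def gamma_half_integer(n):
    """Closed form (2n-1)(2n-3)...3*1 / 2**n * sqrt(pi) for Gamma((2n+1)/2)."""
    odd_product = 1
    for k in range(1, 2 * n, 2):
        odd_product *= k
    return odd_product / 2 ** n * math.sqrt(math.pi)

for p in (0.1, 0.25, 0.5, 0.9):
    print(p, reflection_gap(p))          # rounding-error level

for n in range(5):
    print(n, gamma_half_integer(n), math.gamma((2 * n + 1) / 2))
```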


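11.66. Using the gamma function, we can easily evaluate the following important integral encountered in probability theory:

$$I_m = \int_0^\infty x^me^{-ax^2}\,dx \qquad (a > 0,\ m > -1).$$

In fact, the substitution

$$ax^2 = t, \quad x = \sqrt{\frac{t}{a}}, \quad dx = \frac{dt}{2\sqrt{at}}$$

reduces the integral $I_m$ to the form

$$I_m = \frac{1}{2a^{(m+1)/2}}\int_0^\infty t^{(m-1)/2}e^{-t}\,dt = \frac{1}{2a^{(m+1)/2}}\Gamma\left(\frac{m+1}{2}\right).$$

In particular, we have

$$\int_0^\infty e^{-x^2}\,dx = \frac{1}{2}\Gamma\left(\frac{1}{2}\right) = \frac{\sqrt{\pi}}{2}.$$

Figure 81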
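The closed form for $I_m$ can be compared with a direct quadrature. The sketch below is a plain midpoint rule on a truncated interval $[0, X]$; the function names, the truncation point $X = 12$, and the number of subintervals are assumptions made for illustration.

```python
import math

def moment_closed_form(m, a):
    """Gamma((m+1)/2) / (2 * a**((m+1)/2)), the value of I_m derived above."""
    return math.gamma((m + 1) / 2) / (2 * a ** ((m + 1) / 2))

def moment_midpoint(m, a, X=12.0, n=24000):
    """Midpoint-rule approximation of the integral of x**m * exp(-a*x**2) over [0, X]."""
    h = X / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x ** m * math.exp(-a * x * x)
    return total * h

for m, a in ((0, 1.0), (2, 1.5), (4, 0.5)):
    print(m, a, moment_midpoint(m, a), moment_closed_form(m, a))
```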


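11.67. Asymptotic representation of the gamma function. We now find an asymptotic representation of the gamma function

$$\Gamma(\tau) = \int_0^\infty e^{-t}t^{\tau-1}\,dt$$

for large values of $\tau$.

a. LEMMA. Let $f(x)$ be a nonnegative function defined for $x > 0$, such that (a) $f(1) = 0$; (b) $f(x)$ is decreasing for $0 < x < 1$ and increasing for $x > 1$; (c) $f(x)$ has the representation

$$f(x) = a(x-1)^2 + (x-1)^{2+2\delta}\psi(x),$$

where $a > 0$, $\delta > 0$, $|\psi(x)| \le M$ in a neighborhood of the point $x = 1$; (d) $f(x)$ satisfies the inequality

$$f(x) \ge bx \qquad (b > 0) \tag{20}$$

for $x \ge c > 1$. Then the function

$$I(s) = \int_0^\infty e^{-sf(x)}\,dx \tag{21}$$

satisfies the limiting relation

$$\lim_{s\to\infty}\frac{I(s)}{\sqrt{\pi/(sa)}} = 1. \tag{22}$$

Proof. The convergence of the integral (21) for $s > 0$ follows from the estimate (20). For $0 < \varepsilon < 1$ we have

$$I = \left\{\int_0^{1-\varepsilon} + \int_{1-\varepsilon}^{1+\varepsilon} + \int_{1+\varepsilon}^{c} + \int_c^\infty\right\}e^{-sf(x)}\,dx. \tag{23}$$

By hypothesis,

$$\int_0^{1-\varepsilon} e^{-sf(x)}\,dx \le e^{-sf(1-\varepsilon)} \le e^{-s(a/2)\varepsilon^2}, \tag{24}$$

$$\int_{1+\varepsilon}^{c} e^{-sf(x)}\,dx \le (c-1)e^{-sf(1+\varepsilon)} \le (c-1)e^{-s(a/2)\varepsilon^2} \tag{25}$$

for sufficiently small $\varepsilon$, while

$$\int_c^\infty e^{-sf(x)}\,dx \le \int_c^\infty e^{-sbx}\,dx = \frac{e^{-sbc}}{sb}. \tag{26}$$

With a view to calculating the integral from $1 - \varepsilon$ to $1 + \varepsilon$, we note that

$$\left|e^{-s(x-1)^{2+2\delta}\psi(x)} - 1\right| = \left|-s(x-1)^{2+2\delta}\psi(x) + \tfrac{1}{2}s^2(x-1)^{2(2+2\delta)}\psi^2(x) - \cdots\right|$$

$$\le Ms\varepsilon^{2+2\delta} + \tfrac{1}{2}M^2s^2\varepsilon^{2(2+2\delta)} + \cdots < 2Ms\varepsilon^{2+2\delta}$$

on the interval $[1 - \varepsilon, 1 + \varepsilon]$, provided only that

$$Ms\varepsilon^{2+2\delta} < 1. \tag{27}$$

So far, the number $\varepsilon$ is arbitrary. Now let

$$\varepsilon = s^{-1/(2+\delta)}$$

for large $s$. Then

$$s\varepsilon^{2+2\delta} = s^{1-(2+2\delta)/(2+\delta)} = s^{-\delta/(2+\delta)} \to 0$$

as $s \to \infty$, and the condition (27) is indeed satisfied. On the other hand, for this choice of $\varepsilon$,

$$s\varepsilon^2 = s^{1-2/(2+\delta)} = s^{\delta/(2+\delta)} \to \infty$$

as $s \to \infty$. Hence

$$\left|e^{-s(x-1)^{2+2\delta}\psi(x)} - 1\right| < 2Ms\varepsilon^{2+2\delta} \to 0$$

for any $x \in [1 - \varepsilon, 1 + \varepsilon]$ as $s \to \infty$, so that

$$e^{-s(x-1)^{2+2\delta}\psi(x)} = 1 + o(1),$$

where $o(1) \to 0$ as $s \to \infty$ (cf. Sec. 4.32). Therefore

$$\int_{1-\varepsilon}^{1+\varepsilon} e^{-sf(x)}\,dx = \int_{1-\varepsilon}^{1+\varepsilon} e^{-sa(x-1)^2}[1 + o(1)]\,dx = [1 + o(1)]\int_{1-\varepsilon}^{1+\varepsilon} e^{-sa(x-1)^2}\,dx$$

$$= [1 + o(1)]\frac{1}{\sqrt{sa}}\int_{-\varepsilon\sqrt{sa}}^{\varepsilon\sqrt{sa}} e^{-u^2}\,du.$$

But $\varepsilon\sqrt{s} \to \infty$ as $s \to \infty$, and hence the last integral approaches the limit

$$\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}$$

as $s \to \infty$ (see Sec. 11.66), so that

$$\int_{-\varepsilon\sqrt{sa}}^{\varepsilon\sqrt{sa}} e^{-u^2}\,du = [1 + o(1)]\sqrt{\pi},$$

where $o(1) \to 0$ as $s \to \infty$. It follows that

$$\int_{1-\varepsilon}^{1+\varepsilon} e^{-sf(x)}\,dx = [1 + o(1)]\sqrt{\frac{\pi}{sa}}.$$

The other terms in (23), namely (24), (25), and (26), all approach zero exponentially as $s \to \infty$. Thus, finally,

$$I(s) = \int_0^\infty e^{-sf(x)}\,dx = [1 + o(1)]\sqrt{\frac{\pi}{sa}} + o\left(\frac{1}{\sqrt{s}}\right) = \sqrt{\frac{\pi}{sa}}\,[1 + o(1)],$$

which is equivalent to (22).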


b. We now transform the expression for the gamma function into a form suitable for application of the lemma. Making the substitution $t = sx$, we get

$$\Gamma(s+1) = \int_0^\infty t^se^{-t}\,dt = s^{s+1}\int_0^\infty x^se^{-sx}\,dx = s^{s+1}\int_0^\infty e^{-sx+s\ln x}\,dx$$

$$= s^{s+1}e^{-s}\int_0^\infty e^{-s(x-\ln x-1)}\,dx = s^{s+1}e^{-s}\int_0^\infty e^{-sf(x)}\,dx,$$

where

$$f(x) = x - \ln x - 1.$$

It is easy to see that $f(x)$ satisfies all the conditions of the lemma. Clearly $f(1) = 0$, $f'(1) = 0$, $f''(1) = 1$, $f'''(1) = -2$, and hence

$$f(x) = \frac{(x-1)^2}{2} - \frac{(x-1)^3}{3}E, \qquad \lim_{x\to 1}E = 1$$

in a neighborhood of the point $x = 1$, so that the constant $a$ equals $\frac{1}{2}$. Therefore, applying the lemma, we get the desired asymptotic representation of the gamma function:†

$$\Gamma(s+1) = s^{s+1}e^{-s}\sqrt{\frac{2\pi}{s}}\,[1 + o(1)] \sim \sqrt{2\pi s}\,s^se^{-s}. \tag{28}$$

In particular, if $s = n$ is a positive integer, then $\Gamma(n+1) = n!$ and (28) reduces to Stirling’s formula

$$n! \sim \sqrt{2\pi n}\,n^ne^{-n} \tag{28'}$$

(dating from 1730).
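The asymptotic relation (28) can be observed numerically by forming the ratio of $\Gamma(s+1)$ to $\sqrt{2\pi s}\,s^se^{-s}$. The sketch below works with logarithms (via the standard-library `math.lgamma`) to avoid overflow for large $s$; the function name `stirling_ratio` is hypothetical.

```python
import math

def stirling_ratio(s):
    """Gamma(s + 1) divided by sqrt(2*pi*s) * s**s * exp(-s), computed via logarithms."""
    log_exact = math.lgamma(s + 1)
    log_stirling = 0.5 * math.log(2 * math.pi * s) + s * math.log(s) - s
    return math.exp(log_exact - log_stirling)

for s in (1, 10, 100, 1000, 10000):
    # The ratio decreases toward 1 as s grows.
    print(s, stirling_ratio(s))
```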


11.68. The gamma function in the complex domain. The formula

$$\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}\,dt = \int_0^\infty e^{(z-1)\ln t - t}\,dt \tag{29}$$

defining the gamma function is applicable not only for positive real $z$, but also for certain complex values of $z$. In fact, the integral (29) is also convergent if $z = x + iy$, $x > 0$, since the integrand

$$e^{(x+iy-1)\ln t - t} = e^{(x-1)\ln t - t}e^{iy\ln t} = t^{x-1}e^{-t}e^{iy\ln t} \tag{30}$$

differs from the function $t^{x-1}e^{-t}$ only by a factor of absolute value 1. Thus (29) defines $\Gamma(z)$ directly for all $z = x + iy$ with $x = \operatorname{Re} z > 0$, i.e., for all $z$ in the whole (open) right half-plane $G$. Moreover, the integral (29) converges uniformly inside $G$, since the quantity

$$\alpha = \min_{z\in Q}\operatorname{Re} z$$

is positive for every compact set $Q \subset G$, so that (30) has the integrable majorant $t^{\alpha-1}e^{-t}$ for $0 < t \le 1$ and, just as in Sec. 11.61, $t^{\beta-1}e^{-t}$ for $t \ge 1$, where $\beta = \max_{z\in Q}\operatorname{Re} z$, for all $z \in Q$. It follows from Theorem 11.55b that $\Gamma(z)$ is analytic on $G$.

Next we examine the possibility of continuing the function $\Gamma(z)$ analytically into the left half-plane (cf. Sec. 10.39c). To this end, we use the formula

$$\Gamma(z+n) = z(z+1)\cdots(z+n-1)\Gamma(z), \tag{31}$$

proved for positive real $z$ in Sec. 11.62. Since the two sides of (31) are obviously analytic on $G$ and since they coincide on the positive real axis, it follows from the uniqueness theorem for analytic functions (Theorem 10.39b) that they coincide on the whole half-plane $G$. Writing (31) in the form

$$\Gamma(z) = \frac{\Gamma(z+n)}{z(z+1)\cdots(z+n-1)}, \tag{32}$$

we observe that the right-hand side is defined and analytic on the domain $G_n = \{z : \operatorname{Re} z > -n\}$, apart from the zeros of the denominator. The formally distinct definitions obtained for different values of $n$ all give the same value of $\Gamma(z)$ for any given $z$, by the uniqueness of the analytic continuation. Since we can make $n$ arbitrarily large, this defines $\Gamma(z)$ everywhere in the $z$-plane except for isolated singular points $z = 0, -1, \ldots, -n+1, \ldots$ It is clear from (32) that these singular points are all poles of the first order. To calculate the residue of $\Gamma(z)$ at the point $z = -n+1$, we use formula (3), p. 406, obtaining

$$\operatorname*{Res}_{z=-n+1}\Gamma(z) = \left.\frac{\Gamma(z+n)}{z(z+1)\cdots(z+n-2)}\right|_{z=-n+1} = \frac{\Gamma(1)}{(-1)(-2)\cdots(-n+1)} = \frac{(-1)^{n-1}}{(n-1)!}.$$
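Formula (32) and the residue computation can be explored numerically, at least on the real axis. The sketch below is restricted to real $z$ because the standard-library `math.gamma` accepts only real arguments; the names `gamma_continued` and `residue_estimate` are hypothetical, and the residue is only estimated by multiplying $\Gamma(z)$ by the small distance to the pole.

```python
import math

def gamma_continued(z, n):
    """Right-hand side of (32) for real z > -n that is not a pole."""
    denominator = 1.0
    for k in range(n):
        denominator *= (z + k)
    return math.gamma(z + n) / denominator

def residue_estimate(n, eps=1e-6):
    """Approximate residue at z = -n + 1 as eps * Gamma(-n + 1 + eps)."""
    return eps * gamma_continued(-n + 1 + eps, n)

# Different values of n give the same continued value, e.g. Gamma(-1/2) = -2*sqrt(pi).
print(gamma_continued(-0.5, 1), gamma_continued(-0.5, 3), -2 * math.sqrt(math.pi))

for n in (1, 2, 3, 4):
    exact = (-1) ** (n - 1) / math.factorial(n - 1)
    print(n, residue_estimate(n), exact)
```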


Problems 


1. Prove Dirichlet’s formula 


1 = 5 
| {| PPM aay) Yea) oh a 
o (Jo 
1 i-y ot 

= [h(a tax)" Mes) ash & 
where 04S 1,0 <p S1,0<v 1 and fix, y) is continuous for 0 <x <1,08 
< 
ySl. 


2. Prove Frullani’s formula

$$\int_0^\infty \frac{f(ax) - f(bx)}{x}\,dx = f(0)\ln\frac{b}{a} \qquad (a > 0,\ b > 0),$$

where $f(x)$ is continuous for $0 \le x < \infty$ and the integral

$$\int_1^\infty \frac{f(x)}{x}\,dx$$

converges.


3. Use integration in the complex plane to evaluate 


I(p)= |e as, 
0 
where Re p 2 0. 


4. Evaluate the Fresnel integrals

$$F_1 = \int_0^\infty \sin(x^2)\,dx, \qquad F_2 = \int_0^\infty \cos(x^2)\,dx.$$


5. Evaluate the singular Fourier integral 
I= | ” P(x) sin x 
-wQ(x) x 
where P(x) and Q(x) are polynomials such that Q(x) has no real zeros and the 
degree of P(x) does not exceed that of O(x). 
6. Prove that 


ihemse}e=— 
1 co t—x 
kth ct si 


Why doesn’t this contradict Theorem 11.54? 
7. Starting from the integral (18), p. 467, prove that | °1l—cos Bx , 7B 
eT a aaa o 


0 x 


and hence that 


8. Prove that 

b 

dx 

| =X (a<b). 
9. Prove that

$$\int_0^{\pi/2} \ln\sin x\,dx = -\frac{\pi}{2}\ln 2.$$


10. Prove that 


11. Prove that if the function f(x) is continuous and has a bounded integral 


@(x) = | fx) dx 


for all x 2 a, then the integral (2 dx (a,a> 0) 
2. < 


converges. 


12. Use residues to evaluate the integral

$$\int_0^\infty \frac{x^{2m}}{1 + x^{2n}}\,dx \qquad (m < n),$$

where $m$ and $n$ are positive integers.


13. Let L be the path in the plane of the complex variable s consisting of the 
interval [€, 0) of the real axis traversed from right to left, the circle |s| = ¢ 
traversed once in the counterclockwise direction, and the interval [¢, ©) 


traversed once again, this time from left to right. Prove that the formula 


l 
I'(z | oe 
(z) lz] by s s 


. 


represents the gamma function for all complex z except the poles z = 0, — 1, — 2, 


+ Here we anticipate the almost trivial Theorem 11.21a. 

+ In this case, the function f(x) is said to be absolutely integrable on [a, ©). 

† We could also apply Leibniz’s test directly to the integral (6).

+ Note the similarity between this theorem and the Abel-Dirichlet test for series (Theorem 6.47c). 
+ Here we allow f(x) to become infinite at the point x=a, but nowhere else. 


+ As an exercise, the reader should state (and prove) the Cauchy convergence criterion, the comparison test, 
the definition of absolute convergence, and the analogue of Theorem 11.21a, the Leibniz and Abel-Dirichlet 
tests, etc., for improper integrals of the second kind. 


+ Instead of [a, b] we write (a, b] if a = —oo and [a, b) if b = +00. 

{ Here we use the term “singular point” in a sense different from that of Sec. 10.37c. 
+ For brevity, we omit the integrand f(x) in the intermediate steps of the calculation. 
+ Note that in this case the complex Fourier integral 


oo glox 
Lae 
fails to exist. 
† As in Theorem 9.114, $\Phi'(t)$ denotes the derivative of $\Phi(t)$ and $f_t(x,t)$ the partial derivative of $f(x,t)$ with respect to $t$.


+ Here we use the fact that the continuous functions f(x) and g(x) are uniformly continuous on every closed 
interval (see Theorem 5.17b). 


+ The symbol ~ denotes approximate equality (for large values of the argument). 


Appendix A 


Elementary Symbolic Logic 


A.1. Logicians have long made use of an economical notation for writing out 
mathematical demonstrations. We now give a brief sketch of the simplest and 
most useful notation of this kind. 

Suppose we are interested not so much in the concrete nature of a given proposition as in its relation to other propositions. Then we can designate the proposition by a single letter, such as $\alpha$ or $\beta$. We now introduce the following notations, each with the meanings given in parentheses: (1) $\alpha \Rightarrow \beta$ (“the proposition $\alpha$ implies the proposition $\beta$”); (2) $\alpha \Leftrightarrow \beta$ (“each of the propositions $\alpha$ and $\beta$ implies the other, i.e., $\alpha$ and $\beta$ are equivalent propositions”); (3) $\forall x \in E: \alpha$ (“for all $x \in E$ the proposition $\alpha$ is true”); (4) $\exists x \in E: \alpha$ (“there exists an element $x \in E$ such that the proposition $\alpha$ is true”).

For example, saying that “the number $\xi$ is the least upper bound of the set $E$” means that the following two conditions hold (see Sec. 1.24): (a) $\forall x \in E: x \le \xi$ (“the inequality $x \le \xi$ holds for all $x \in E$”); (b) $\forall a \ge E: a \ge \xi$ (“every $a$ greater than or equal to every element of $E$ is greater than or equal to $\xi$”).

A.2. Let $\bar\alpha$ denote “not $\alpha$,” i.e., the negation of the proposition $\alpha$. Then, clearly,

$$\bar{\bar\alpha} \Leftrightarrow \alpha,$$

$$(\alpha \Rightarrow \beta) \Leftrightarrow (\bar\beta \Rightarrow \bar\alpha),$$

$$(\alpha \Leftrightarrow \beta) \Leftrightarrow (\bar\alpha \Leftrightarrow \bar\beta).$$

We now construct the negation of the proposition $\forall x \in E: \alpha$ (“every $x \in E$ has the property $\alpha$”). If the proposition in question is false, then the property $\alpha$ does not hold for every $x \in E$, i.e., there exists an element $x \in E$ which does not have the property $\alpha$. Therefore

$$\overline{\forall x \in E: \alpha} \Leftrightarrow \exists x \in E: \bar\alpha.$$

The negation of the proposition $\exists x \in E: \alpha$ (“there exists an $x \in E$ with the property $\alpha$”) can be found similarly: If the proposition is false, then there is no such $x \in E$, i.e., every $x \in E$ fails to have the property $\alpha$. Therefore

$$\overline{\exists x \in E: \alpha} \Leftrightarrow \forall x \in E: \bar\alpha.$$

Thus putting an expression under the overbar has the effect of changing $\forall$ to $\exists$ or $\exists$ to $\forall$, and then replacing the property appearing after the colon by its negative.

For example, the negation of Condition b above is just

$$(\bar b)\quad \overline{\forall a \ge E: a \ge \xi} \Leftrightarrow \exists a \ge E: \overline{a \ge \xi} \Leftrightarrow \exists a \ge E: a < \xi$$

(“there exists an $a$ greater than or equal to every $x \in E$ and less than $\xi$”). Writing out $a \ge E$ in more detail, we get

$$(\bar b)\quad \exists a < \xi\ \forall x \in E: x \le a.$$

A.3. Consider the proposition

$$(c)\quad \forall\varepsilon > 0: \exists x \in E,\ x > \xi - \varepsilon$$

(“for every $\varepsilon > 0$ there exists an $x \in E$ greater than $\xi - \varepsilon$”). The negation of this proposition is given by

$$(\bar c)\quad \overline{\forall\varepsilon > 0: \exists x \in E,\ x > \xi - \varepsilon} \Leftrightarrow \exists\varepsilon > 0: \forall x \in E,\ x \le \xi - \varepsilon.$$

Replacing $\xi - \varepsilon$ by $a$, we get

$$(\bar c)\quad \exists a < \xi: \forall x \in E,\ x \le a,$$

which is identical with Condition $\bar b$. It follows that Conditions c and b coincide. Thus, in defining the least upper bound, we can replace Condition b by Condition c.
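The rules of A.2 can be checked mechanically on finite examples. In the sketch below, the built-in `all` and `any` play the roles of $\forall$ and $\exists$ over a finite set, and `not` plays the role of the overbar; the helper names are hypothetical, and a finite check of this kind illustrates the rules without proving them in general.

```python
from itertools import product

def implies(a, b):
    """Truth value of 'a implies b'."""
    return (not a) or b

def contraposition_holds():
    """Check (a => b) <=> (not b => not a) over the whole truth table."""
    return all(implies(a, b) == implies(not b, not a)
               for a, b in product([False, True], repeat=2))

def negation_rules_hold(E, prop):
    """Check both quantifier-negation rules for a finite set E and a property prop."""
    forall_rule = (not all(prop(x) for x in E)) == any(not prop(x) for x in E)
    exists_rule = (not any(prop(x) for x in E)) == all(not prop(x) for x in E)
    return forall_rule and exists_rule

print(contraposition_holds())
print(negation_rules_hold(range(10), lambda x: x % 2 == 0))
print(negation_rules_hold(range(10), lambda x: x < 100))
```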


A.4. We have just seen how operations involving logical symbols can have 
interesting implications. Naturally, the arguments given above could have been 
carried out without using logical symbols at all. However, “symbolic logic” of 
the type just described is often very useful. Although we have avoided using it in 
the text (where maximum economy of style was not an objective), we 
recommend that the reader make use of it in his own work. 


Appendix B 
Measure and Integration on a Compact Metric Space 


In this appendix, we sketch the main features of a general theory of measure and 
integration on a compact metric space, stating a number of theorems without 
proof. 


B.1. We begin by defining the Riemann integral on a compactum, i.e., on a compact metric space (Sec. 3.91). A compactum $K$ is said to be weighted if $K$ has a family of subsets, called cells, with the following properties:

(1) $K$ itself and the empty set are cells;

(2) The intersection of any pair of cells sharing interior points is a cell;

(3) If a cell $Q$ is contained in a cell $P$, then $P$ can be represented as a union of nonintersecting cells with $Q$ as one of the cells;

(4) Given any $\delta > 0$, $K$ is the union of a finite number of cells sharing no interior points (Sec. 3.21), all of diameter less than $\delta$ (Sec. 3.12a);

(5) Every cell $Q$ is assigned a nonnegative number $m(Q)$, called the measure of $Q$;

(6) If $Q$ is a union of cells $Q_1, \ldots, Q_n$ sharing no interior points, then

$$m(Q) = m(Q_1) + \cdots + m(Q_n),$$

a formula called the (finite) additivity condition.

The (Riemann) integral over a weighted compactum $K$ is defined by analogy with the definition of the integral over a closed interval $[a, b]$, as described in Sec. 9.1. Let $f(x)$ be a function defined on $K$, and let $\Pi$ be a partition of $K$ into a finite number of cells $Q_1, \ldots, Q_n$ sharing no interior points. Moreover, let $d(\Pi)$ be the maximum diameter of the cells $Q_1, \ldots, Q_n$. Choosing an arbitrary point $\xi_k$ in each cell $Q_k$, we form the Riemann sum

$$S_\Pi(f) = \sum_{k=1}^n f(\xi_k)m(Q_k). \tag{1}$$

Then a finite number $I$ is called the (Riemann) integral of the function $f(x)$ over the weighted compactum $K$ if, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that

$$|I - S_\Pi(f)| < \varepsilon$$

for every partition $\Pi$ with $d(\Pi) < \delta$ (cf. Sec. 9.13). In other words, the integral of $f(x)$ over $K$ is the limit of the Riemann sum (1) under arbitrary refinement of the partition $\Pi$. If this integral exists, we say that $f(x)$ is integrable on $K$. Just as in Sec. 9.13, this definition is comprised in the general scheme of a limit in a direction. Moreover, we have the following analogue of Theorem 9.14e, proved in virtually the same way: If $f(x)$ is continuous on $K$, then $f(x)$ is integrable on $K$.


B.2. Examples 


a. Let K be the interval [a, b] equipped with the usual metric, and let the cells be 
all possible intervals a Sx SB, a <x <P, aSx< Ba <x < B, where a SZ, 
with the length of each interval being chosen as its measure. Then the integral 
over K reduces to the ordinary integral over [a, b], as defined in Sec. 9.13. 


b. Let K be the “closed n-dimensional block” specified by the inequalities

$$a_1 \le x_1 \le b_1, \ \ldots, \ a_n \le x_n \le b_n,$$

and equipped with the metric of n-dimensional Euclidean space (see Sec. 3.14a).
Let the cells be all possible “subblocks” $Q \subset K$ specified by inequalities of the
form

$$\alpha_1 \le x_1 \le \beta_1, \ \ldots, \ \alpha_n \le x_n \le \beta_n,$$

and also by the inequalities obtained by replacing some or all of the signs $\le$ by $<$.
As the measure of Q, we choose its “n-dimensional volume,” i.e., the number

$$m(Q) = \prod_{k=1}^{n} (\beta_k - \alpha_k).$$

Then it is easy to see that this system of cells and measures satisfies Properties
1-5 above. In this case, the corresponding integral is called the integral over the
n-dimensional block K.
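As an illustration only (it is not part of the original text), the Riemann sum (1) over an n-dimensional block can be sketched numerically. The sketch below assumes a uniform partition of the block into equal subblocks, with each tag point taken at the center of its cell; the name `riemann_sum_block` and the sample integrand are hypothetical choices made for the example.

```python
from itertools import product


def riemann_sum_block(f, lower, upper, cells_per_axis):
    """Riemann sum of f over the block lower[k] <= x_k <= upper[k].

    Assumption: the partition is uniform (cells_per_axis equal pieces along
    every axis) and the tag point of each cell is its center.
    """
    n = len(lower)
    steps = [(b - a) / cells_per_axis for a, b in zip(lower, upper)]
    # Measure of one cell = product of its edge lengths (n-dimensional volume).
    cell_measure = 1.0
    for h in steps:
        cell_measure *= h
    total = 0.0
    for index in product(range(cells_per_axis), repeat=n):
        tag = [a + (i + 0.5) * h for a, i, h in zip(lower, index, steps)]
        total += f(tag) * cell_measure
    return total


def sample(p):
    # Hypothetical integrand: f(x1, x2) = x1^2 + x2^2 on the unit square.
    return p[0] ** 2 + p[1] ** 2


coarse = riemann_sum_block(sample, [0.0, 0.0], [1.0, 1.0], 4)
fine = riemann_sum_block(sample, [0.0, 0.0], [1.0, 1.0], 64)
print(coarse, fine)  # both approximate the exact integral 2/3
```

Refining the partition (a smaller maximum cell diameter) brings the sum closer to the exact value 2/3, in line with the definition of the integral as a limit of Riemann sums.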


B.3. A set $Z \subset K$ is said to be a set of (Jordan) measure zero if, given any $\varepsilon > 0$,
there is a finite collection of cells $Q_1, \ldots, Q_p$ such that every point $x \in Z$ is an
interior point of

$$\bigcup_{k=1}^{p} Q_k$$

and

$$\sum_{k=1}^{p} m(Q_k) < \varepsilon.$$

There is a theorem, generalizing the theorem on integration of a piecewise
continuous function (Theorem 9.16c), which asserts that if f(x) is bounded on a
weighted compactum K and continuous outside of a set of measure zero, then
f(x) is integrable on K.


B.4. Next, with every set $E \subset K$ we associate a function

$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E, \end{cases}$$

called the characteristic function of E. A set E is said to have volume (or Jordan
measure) if the function $\chi_E(x)$ is integrable, and we then call the number

$$m(E) = I(\chi_E),$$

where $I(\chi_E)$ is the integral of $\chi_E$, the volume (or Jordan measure) of E. Every set
of measure zero has zero volume (why?). It follows from the basic properties of
the integral that $m(E_1) \le m(E_2)$ if $E_1 \subset E_2$ (provided that $m(E_1)$ and $m(E_2)$ exist)
and that if

$$E = E_1 \cup \cdots \cup E_p, \tag{2}$$

where the sets $E, E_1, \ldots, E_p$ all have volume and the sets $E_1, \ldots, E_p$ share no
interior points, then

$$m(E) = m(E_1) + \cdots + m(E_p) \tag{3}$$

(volume is additive).

Which sets $E \subset K$ have volume? This question can be answered by
examining the boundary of E. By the boundary of E we mean the set of all
points $x \in K$ which are limit points of both E and its complement (i.e., of all
points $x \in K$ such that every neighborhood of x contains both points of E and
points of K − E). It turns out that a set $E \subset K$ has volume if and only if the
Jordan measure of its boundary is zero. Sets with volume will henceforth be
called Jordan sets. In particular, every cell Q is a Jordan set whose volume is its
originally assigned measure m(Q). Clearly (2) implies (3) if $E, E_1, \ldots, E_p$ are all
Jordan sets and if $E_1, \ldots, E_p$ share no interior points.
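The criterion that a set has volume exactly when its boundary has Jordan measure zero can be illustrated numerically. The following sketch is an illustration under stated assumptions, not the book's construction: it takes K to be the square block [−1, 1] × [−1, 1], E the closed unit disk, and uses a uniform grid of cells. It adds up the measures of the cells lying entirely inside E (an inner sum) and of the cells meeting E (an outer sum); the difference is the total measure of the cells that cover the boundary. The function name `inner_outer_area` is hypothetical.

```python
import math


def inner_outer_area(radius, half_width, cells_per_axis):
    """Inner and outer cell sums for the closed disk x^2 + y^2 <= radius^2.

    Assumption: the block is the square [-half_width, half_width]^2, cut into
    cells_per_axis * cells_per_axis equal square cells.
    """
    h = 2.0 * half_width / cells_per_axis
    cell_measure = h * h
    r2 = radius * radius
    inner = 0.0
    outer = 0.0

    def near_far(lo, hi):
        # Smallest and largest absolute value of a coordinate in [lo, hi].
        near = 0.0 if lo <= 0.0 <= hi else min(abs(lo), abs(hi))
        far = max(abs(lo), abs(hi))
        return near, far

    for i in range(cells_per_axis):
        x0 = -half_width + i * h
        near_x, far_x = near_far(x0, x0 + h)
        for j in range(cells_per_axis):
            y0 = -half_width + j * h
            near_y, far_y = near_far(y0, y0 + h)
            if far_x ** 2 + far_y ** 2 <= r2:
                inner += cell_measure  # the whole cell lies in the disk
            if near_x ** 2 + near_y ** 2 <= r2:
                outer += cell_measure  # the cell meets the disk
    return inner, outer


inner, outer = inner_outer_area(1.0, 1.0, 200)
print(inner, math.pi, outer)  # inner <= pi <= outer
```

As the grid is refined, the gap between the two sums shrinks, which is the numerical counterpart of the boundary circle having Jordan measure zero.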


B.5. A simple but “sufficiently rich” class of Jordan sets can easily be found in
the case where K is an n-dimensional block (see Example B.2b). In fact, it turns
out that a bounded set $E \subset K$ is a Jordan set if its boundary is the union of a
finite number of “surfaces” with equations of the form

$$x_j = \varphi_j(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n),$$

where $\varphi_j$ is a continuous function of the point $(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n)$ defined
on some domain of (n − 1)-dimensional space. In particular, every “polyhedron”
(i.e., every set bounded by a finite number of planes) is a Jordan set.


B.6. Two sets F and G in the n-dimensional Euclidean space $R_n$ are said to be
congruent if there exists an isometric mapping of $R_n$ into itself carrying F into G
(Sec. 3.15b). It can be shown that congruent Jordan sets have the same volume.
This is true in particular if G is obtained from F by a shift or by reflection in some
plane (Sec. 3.15a), or by a rotation (Sec. 5.75b). Moreover, if a Jordan set G is
obtained from a Jordan set F by $\lambda$-fold expansion along some axis (Sec. 2.67),
then

$$m(G) = \lambda\, m(F),$$

while if G is obtained from F by $\lambda_1$-fold expansion along the $x_1$-axis, $\lambda_2$-fold
expansion along the $x_2$-axis, and so on, then

$$m(G) = \lambda_1 \lambda_2 \cdots \lambda_n\, m(F).$$

In particular, if G is obtained from F by $\lambda$-fold expansion along all axes, i.e., if G
is similar to F with ratio of similitude $\lambda$, then

$$m(G) = \lambda^n m(F).$$
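The behavior of volume under expansion along the axes can be checked on a simple example. The sketch below is only an illustration, with hypothetical names: F is the closed unit disk in the plane, G is its image under 2-fold expansion along the first axis and 3-fold expansion along the second (an ellipse), and each area is estimated by a Riemann sum of the characteristic function over a uniform grid with cell centers as tag points.

```python
def area_by_cell_centers(indicator, lower, upper, cells_per_axis):
    """Riemann sum of a characteristic function over a plane block.

    Assumption: uniform grid of cells, tag point = center of each cell.
    """
    hx = (upper[0] - lower[0]) / cells_per_axis
    hy = (upper[1] - lower[1]) / cells_per_axis
    count = 0
    for i in range(cells_per_axis):
        x = lower[0] + (i + 0.5) * hx
        for j in range(cells_per_axis):
            y = lower[1] + (j + 0.5) * hy
            if indicator(x, y):
                count += 1
    return count * hx * hy


def in_disk(x, y):
    return x * x + y * y <= 1.0


def in_ellipse(x, y):
    # Image of the unit disk under 2-fold and 3-fold expansion along the axes.
    return (x / 2.0) ** 2 + (y / 3.0) ** 2 <= 1.0


disk_area = area_by_cell_centers(in_disk, (-1.0, -1.0), (1.0, 1.0), 300)
ellipse_area = area_by_cell_centers(in_ellipse, (-2.0, -3.0), (2.0, 3.0), 300)
print(ellipse_area / disk_area)  # close to 2 * 3 = 6
```

The expansion carries each grid cell of the first block onto a grid cell of the second, with its measure multiplied by 2 · 3, which is why the ratio of the two estimates comes out so close to 6.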


B.7. In the plane (n = 2), the curvilinear trapezoid $\Phi$ considered in Sec. 9.21,
bounded by the x-axis and the curves x = a, x = b, $y = f(x) \ge 0$ is a Jordan set if
the function f(x) is continuous.† Let $\Phi_1$ be an elementary figure made up of
circumscribed rectangles, as in Figure 22, p. 288, while $\Phi_2$ is an elementary
figure made up of inscribed rectangles, as in Figure 23. Then $\Phi_2 \subset \Phi \subset \Phi_1$, and
hence

$$m(\Phi_2) \le m(\Phi) \le m(\Phi_1),$$

as in Sec. B.4. The proof that

$$m(\Phi) = \int_a^b f(x)\, dx \tag{4}$$

now follows by the argument given in Sec. 9.21. Moreover, in keeping with (2)
and (3), the area of a figure made up of a finite number of curvilinear trapezoids
(or figures congruent to such trapezoids), is just a sum of integrals like (4). Such
a figure, made up of four curvilinear trapezoids, is shown in Figure 82.
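The squeeze between inscribed and circumscribed rectangles can be seen in a small computation. The sketch below is an illustration under a simplifying assumption: f is nondecreasing on [a, b], so that on each subinterval the inscribed rectangle has the height of f at the left end point and the circumscribed rectangle the height of f at the right end point. The function name `rectangle_areas` and the choice f(x) = x² on [0, 1] are hypothetical.

```python
def rectangle_areas(f, a, b, n):
    """Areas of the inscribed and circumscribed rectangle figures.

    Assumption: f is nonnegative and nondecreasing on [a, b], and the
    interval is divided into n equal parts.
    """
    h = (b - a) / n
    inscribed = sum(f(a + k * h) for k in range(n)) * h
    circumscribed = sum(f(a + (k + 1) * h) for k in range(n)) * h
    return inscribed, circumscribed


lower, upper = rectangle_areas(lambda x: x * x, 0.0, 1.0, 1000)
print(lower, upper)  # the exact area 1/3 lies between the two values
```

For a monotone function the two areas differ by exactly (f(b) − f(a))(b − a)/n, so both tend to the integral of f over [a, b] as n grows.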

Figure 82

Thus the theory of area in the plane reduces largely to the evaluation of
Riemann integrals. However, it should be noted that this theory is not capable of
answering all the questions that naturally arise. In particular, if E is a union of
sets with volume, then the present theory guarantees the existence of m(E) only
in the case where E is a union of a finite number of sets. The case of a countable
union of sets is allowed in the more general (but more complicated) theory of
Lebesgue measure.‡


† In this case, the curve y = f(x), $a \le x \le b$ automatically has Jordan measure zero (see Sec. B.5).
‡ See e.g., G. E. Shilov and B. L. Gurevich, Integral, Measure and Derivative: A Unified Approach, Part 3.


Selected Hints and Answers 


Chapter 1 


2. Hint. (a) It follows from $|x| = |x - y + y| \le |x - y| + |y|$ that $|x| - |y| \le |x - y|$.
Now interchange x and y.

4. Hint. First prove that the square of an integer is divisible by 3 if and only if 
the integer itself is divisible by 3. 

5. Ans. $\sqrt{2} + \sqrt{6}$.

6. Hint. Obviously $(a - 1)^2 \ge 0$, and hence $a^2 - 2a + 1 \ge 0$ or equivalently
$a^2 + 1 \ge 2a$.

7. Hint. Use induction and Problem 6. 

8. Hint. Apply Problem 7. 

9. Hint. Apply Problem 8. 

10. Ans. (a) x =—1, 3; (b) x = 2; (c) x = —2; (d) x = —2, F. 
11. Ans. The interval (—3, 15). 

12. Ans. The union of the two intervals [—4, —1) and (1, 8]. 

13. Ans. If a = 0, max A = min A = sup A = inf A = 0. If a = 1, max A = min A =
sup A = inf A = 1. If 0 < a < 1, max A = sup A = a and inf A = 0, but min A does
not exist. If a > 1, min A = inf A = a and sup A = ∞, but max A does not exist. If
a = −1, max A = sup A = 1, min A = inf A = −1. If a is negative and 0 < |a| < 1,
then min A = inf A = a and max A = sup A = $a^2$. If a is negative and |a| > 1, then
sup A = ∞, inf A = −∞, but max A and min A do not exist.

14. Hint. Let x = 4, say. 

15. Hint. If 


p=44 | ee 2] 
sar RO 


then a < f and hence a2 < af = tet. 

17. Ans. (a) $3; (b) $34; (c) “e007; (d) —*s's 

18. Hint. Verify that the number sup A + sup B has the defining properties of the
least upper bound of the set A + B.

19. Hint. Verify that the number sup A · sup B has the defining properties of the
least upper bound of the set AB.

20. Hint. Do the same thing as in Problems 18 and 19. 

21. Hint. The set A is bounded from above and y = sup A.


Chapter 2 


1. Hint. Use Theorems 2.34 and 2.35. 

2. Hint. (a) A rational point can be chosen in each interval; (b) A point with 
rational coordinates can be chosen in each of the two loops of the figure eight; 
(c) Only a finite number of points of M can lie outside every interval $[0, \varepsilon]$.

3. Hint. Use a construction like that in the proof of Theorem 2.33. 

4. Hint. There are countably many brothers N. 

5. Hint. Let C be a countable subset of A, and let D = A − C. Then A = D + C,
A + B = D + C + B. Now use the equivalence of the sets C and C + B.

6. Hint. Use Problem 5, with A = / and B the set of all rational numbers. Treat 
the case of the set T similarly. 

7. Hint. The set A is the union of two sets, the set of all sequences containing 
only a finite number of zeros and the set of all sequences containing infinitely 
many zeros. The first set is countable, while the second set has the power of the 
continuum, being in one-to-one correspondence with the set of points of the unit 
interval written in binary notation (see Sec. 1.78). Now use Problem 5. 

8. Hint. Every sequence $n_1, n_2, \ldots$ can be associated with a sequence consisting
only of zeros and ones, with ones in the places with numbers $n_1, n_2, \ldots$. Now use
Problem 7.

9. Hint. Every sequence $n_1, n_2, \ldots$ can be associated with an increasing
sequence $k_1 = n_1$, $k_2 = n_1 + n_2$, ..., $k_m = n_1 + \cdots + n_m$, ...


10. Hint. With every sequence $\xi_1, \xi_2, \ldots$ we can associate an array

$$\begin{matrix} n_{11}, & n_{12}, & \ldots, & n_{1m}, & \ldots \\ n_{21}, & n_{22}, & \ldots, & n_{2m}, & \ldots \\ \multicolumn{5}{c}{\cdots\cdots\cdots} \end{matrix}$$

whose elements are natural numbers (first write $\xi_1$ as a decimal, say, then $\xi_2$, and
so on indefinitely). This array can be written as a single sequence, as in the proof
of Theorem 2.33. Now use Problem 9.

11. Hint. Make an appropriate simplification of the solution of Problem 10. 

12. Hint. Suppose we associate a function $f_\xi(x) \in E$ with every point $\xi \in [0, 1]$.
Then the set $W \subset E$ of all such functions $f_\xi(x)$ does not exhaust the whole set E,
since W does not contain the function g(x) whose value at every point x differs
from $f_x(x)$.

13. Hint. Generalize the solution of Problem 12, replacing [0, 1] by A and
specifying every subset $B \subset A$ by a function equal to 1 on B and 0 outside B.


Chapter 3 


1. Hint. If $A = \{1, \tfrac12, \tfrac13, \ldots\}$, then $A' = \{0\}$ while $A''$ is empty. Continue the
construction recursively.

2. Hint. Every limit point of A’ is a limit point of A. 

3. Hint. It is enough to consider the case n = 1. A point of the set A which is not 
a limit point of A can be covered by an open interval with rational end points 
containing no other points of A. Now use Chapter 2, Problem la. 

4. Hint. A point which is not a condensation point of the set A can be covered 
by an open interval with rational end points containing at most countably many 
points of A. 

5. Hint. “Mark” all intervals with rational end points contained in the intervals 
of &, and then keep an interval of ¥ containing each such “marked” interval. 

8. Hint. Let 
G,= U {ve M: p(x,9) <t p(s,F2)}, 


xeF, 
and similarly for G. 
10. Hint. Use a pair of neighborhoods to make the construction if n = 3. The
space M consisting of four points $x_1, x_2, x_3, x_4$ with distances

$$\rho(x_1, x_2) = \rho(x_2, x_3) = \rho(x_3, x_4) = \rho(x_4, x_1) = 1,$$
$$\rho(x_1, x_3) = \rho(x_2, x_4) = 2$$

already fails to be isometric to any subset of the Euclidean space $R_3$.


11. Hint. Consider the sphere $S_n \subset R_{n+1}$ of radius 2 centered at the point (0, 0,
..., 0, 1). Map the space $R_n \subset R_{n+1}$ of all points $(\xi_1, \xi_2, \ldots, \xi_n, 0)$ onto the sphere
$S_n$, using “straight lines” going through the point (0, 0, ..., 0, 2), and make this
latter point correspond to the point $\infty \in \overline{R}_n$. Then for the metric r choose the
usual metric for the corresponding points of $S_n$ in the space $R_{n+1}$ (“stereographic
projection”).

12. Hint. Given any three points $a, x, y \in M$ (with a fixed), find a triple of points
A, X, Y in the Euclidean plane $R_2$ the same distances apart (cf. Problem 10). The
new metric r(x, y) is then defined by stereographic projection of $R_2$ onto the
sphere $S_2$ tangent to $R_2$ at the point A.
13. Hint. Two elements on every line through the origin of coordinates. 


Chapter 4 


1. Hint. See Chapter 1, Problem 1. Consider the sequence $x_n = (-1)^n$ (n = 1, 2, ...).

2. Hint. Note that $\overline{\lim}$ and $\underline{\lim}$ are limits of certain sequences.

3. Ans. Yes. 

S. Hint. If a, > lim q, < lim a,, let A “1; for n=a,, k=1,2.,..., 

lim a, —lim a, forn#m, 

say. 

6. Ans. $a + $b. 

7. Hint. If 


= e* e fn 
ee Oey Se 


for sufficiently large n. 


8. Hint. Note that $y_n \le y_{n+1} \le x_{n+1} \le x_n$.


9. Hint. Suppose p; =... =p, 874.46 ae 2 Dy: Then 


ifk™ y. 
12. Ans. No, in both cases. For example, let 
X= {1, 2,...,0,...} S={n— oo} 


Y= JOljrnpont Z={0,1}, 
n 


and let

$$y_1(2n) = 1/2n, \quad y_1(2n+1) = 0,$$
$$y_2(n) = 0 \quad (n = 1, 2, \ldots),$$
$$z(0) = 1, \quad z(y) = 0 \ \text{if} \ y \ne 0.$$

Then $z(y_1(n))$ has no limit as $n \to \infty$, while $z(y_2(n))$ has the limit

$$1 \ne \lim_{y \to 0} z(y).$$
13. Hint. Parts a and b can be verified directly, while Part c follows from 
Theorem 4.16c. 


Chapter 5 


1. Ans. No. 

2. Ans. No. 

3. Hint. Use Chapter 1, Problem 1. 

4. Ans. Both functions are discontinuous for integral values of x and continuous 
for all other values of x. 

7. Hint. Note that 


n 
:. ¢, <max SC us caplet 
k=1 


= /|— 


min ae a < 


8. Hint. Use the existence of f(a + 0) and f(b — 0). 
9. Hint. Set x = log. t and apply Theorem 5.41. 


10. Ans. $[-\infty, -1 - \delta]$, $[-1 + \delta, 1 - \delta]$, $[1 + \delta, \infty]$ for any $\delta > 0$.
11. Hint. Consider the pairs of functions sin ax, cos ax and a* sin x, a* cos x. 
12. Hint. Consider the function 


ian if x40, 
Pe = x 
0 if x=(. 
13. Hint. Use formulas (1) — (3), p. 157. 
14. Hint. Every point c with the property that 


lim /(x) —/f(c) 


xc 


2q>0 


has a neighborhood in which there is no other point with the same property. 

15. Ans. The function x(7) is continuous everywhere except at points such that all 
the digits ¢,), t,>, ... are nines, beginning with some index ny. 

16. Hint. If an arc $\Delta_0 \subset \Gamma$ contains no points of M, then the arcs $\Delta_1 = \Delta_0 - 1$,
$\Delta_2 = \Delta_0 - 2$, ..., obtained by shifting $\Delta_0$ back 1 unit, 2 units, ... along $\Gamma$, also contain
no points of M. But the arcs $\Delta_0, \Delta_1, \Delta_2, \ldots \subset \Gamma$ are all of equal length, and hence
some pairs of arcs must intersect. If $\Delta_k$ intersects $\Delta_{k+m}$, say, then the union of
the arcs $\Delta_k, \Delta_{k+m}, \Delta_{k+2m}, \ldots$ covers the whole circle $\Gamma$.

17. Hint. Let q be any point of the set M − P, and let $p_n$ be a sequence of points of
P converging to q. Use the uniform continuity of f(x) on P to show that the
sequence $f(p_n)$ has a limit which is independent of the choice of the sequence
$p_n \to q$. Setting $f(q) = \lim_{n \to \infty} f(p_n)$, verify the continuity of f(q) on the whole set
M.

18. Hint. The intervals (f (x — 0), f(x + 0)), constructed for all discontinuity 


points of f (x), are pairwise nonintersecting (i.e., no two of the intervals 
intersect). 


Chapter 6 


1. Hint. Use Raabe’s test (Theorem 6.17a). 

2. Hint. Use Problem 1 and the tests of Sec. 6.1. 
‘ < oo 

3. Hint. For any n and m™ n, (n—m)a,<a,,, +0 +4, <q = PH , ie 


and hence 


na, < Fa 


n—m 
4. Ans. One solution is to find N = N(n) for each n = 1, 2, ... such that
= l 
Xan n>’ 
and then set $b_m = n$ for $N(n) \le m < N(n+1)$.
5. Hint. Use the inequality k(n — k) Sn? (k= 1, 2, ...,n — 1). 
6. Hint. Son < S al ar 1 = Son + Aon + l- 


7. Hint. Use D’Alembert’s test (Theorem 6.14a).
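8. Hint. Clearly

$$L = \lim_{t \to 1} f(t) \le \sum_{n=0}^{\infty} a_n,$$

where the existence of L follows from the fact that f(t) is nondecreasing and
bounded. On the other hand, given any $\varepsilon > 0$, we can choose N such that

$$\sum_{n=0}^{N} a_n > \sum_{n=0}^{\infty} a_n - \varepsilon,$$

afterwards choosing t such that

$$\sum_{n=0}^{N} a_n t^n > \sum_{n=0}^{N} a_n - \varepsilon.$$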


9. Hint. The idea is the same as in Problem 8. 
10. Hint. z, = P,4,/P,,. 


11. Hint. P= y log, we 
k=1 


12. Hint. Use Theorem 5.58c. 
13. Hint. Calculate (1 — x)P,,. 


14. Hint. Expand

$$\frac{1}{1 - (1/p^x)}$$

in a geometric series.
15. Hint. Euler’s formula continues to hold for x = 1, in which case both sides 
become infinite. 
16. Hint. There are $9^m - 9^{m-1}$ integers between $10^{m-1} - 1$ and $10^m - 1$
containing no nines at all. Hence the sum in question is less than

$$\frac{9 - 1}{1} + \frac{9^2 - 9}{10} + \frac{9^3 - 9^2}{10^2} + \cdots = 80.$$
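The counting step in this hint is easy to confirm by brute force for small m. The sketch below is an illustration only, with hypothetical function names: it counts the m-digit integers with no digit 9, compares the count with $9^m - 9^{m-1}$, and checks that a partial sum of the reciprocals of such integers stays below the corresponding partial sum of the bounding geometric series.

```python
def count_no_nines(m):
    """Number of m-digit positive integers whose decimal form has no digit 9."""
    return sum(1 for k in range(10 ** (m - 1), 10 ** m) if "9" not in str(k))


def partial_sum_no_nines(m_max):
    """Sum of 1/k over all positive integers k < 10**m_max with no digit 9."""
    return sum(1.0 / k for k in range(1, 10 ** m_max) if "9" not in str(k))


def geometric_bound(m_max):
    """Partial sum of the bounding series: (9^m - 9^(m-1)) / 10^(m-1)."""
    return sum((9 ** m - 9 ** (m - 1)) / 10 ** (m - 1) for m in range(1, m_max + 1))


for m in range(1, 5):
    print(m, count_no_nines(m), 9 ** m - 9 ** (m - 1))
print(partial_sum_no_nines(4), geometric_bound(4))
```

Each m-digit integer is at least $10^{m-1}$, so the block of m-digit terms contributes at most $(9^m - 9^{m-1})/10^{m-1}$, and the geometric series with ratio 9/10 and first term 8 sums to 80.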

17. Hint. The length of a vector is no less than the length of any of its projections 
and does not exceed the sum of the lengths of all its projections on the 
coordinate axes. 


18. Hint. Use the compactness of a sphere in $R_n$.

19. Hint. Use Theorem 6.31 (cf. Sec. 6.45). 

20. Hint. Let V, > --- > V,,> --- be a sequence of solid angles containing the 
vector g such that sup sin (x,e>)=0,\,0 


xeV », 
|x] =1 
and 
ys o,< 0. 
n=l 
Now consecutively choose terms of the series (2) contained in /;, ..., V,, ... the 


sum of whose lengths lies between 1 and 2. 
21. Hint. The proof is by induction. For n = 1 the assertion reduces to Riemann’s
theorem (Sec. 6.37). Let f be any vector in $R_n$, and let $R_{n-1}$ be the orthogonal
complement of f, i.e., the set of all $x \in R_n$ such that (x, f) = 0. By the induction
hypothesis, there exists a rearrangement of the series (2) the sum of whose
projections onto $R_{n-1}$ equals a given vector in $R_{n-1}$. Using Problem 20, we
choose a part of the series (2) for which the components along f form a series
converging to +∞ while the components orthogonal to f form an absolutely
convergent series. The components of the complementary part of the series along
f then have the sum −∞. By rearranging these parts we can obtain any desired
component along f. But, by Problem 19, this has no effect on the sum of the
projections onto $R_{n-1}$.

22. Hint. Apply the result of Problem 21 to the orthogonal complement of the
subspace A, i.e., to the set of all $x \in R_n$ such that (x, y) = 0 for all $y \in A$.


Chapter 7 


1. Hint. f(x) = 0. 

2. Hint. Use Lagrange’s theorem (Sec. 7.44). 

3. Hint. Use Lagrange’s theorem. 

4. Hint. y′(0) = 0, but y′(x) does not approach 0 as $x \to 0$.

5. Hint. It suffices to consider the case f′(a) < 0 < f′(b), proving the existence
of a point $c \in (a, b)$ such that f′(c) = 0. This is just the point where f(x) achieves
its minimum.

6. Hint. Give a proof by contradiction, using Lagrange’s theorem. 

7. Hint. If the inequality (1) holds for two points $\lambda = (\lambda_1, \ldots, \lambda_n)$ and $\mu = (\mu_1, \ldots,
\mu_n)$ in n-dimensional real space $R_n$ (Sec. 2.6), then it holds for the entire segment
in $R_n$ connecting $\lambda$ and $\mu$. The inequality is obvious for the points $e_1 = (1, 0, \ldots,
0)$, $e_2 = (0, 1, \ldots, 0)$, ..., $e_n = (0, 0, \ldots, 1)$. From this deduce its validity for an
arbitrary point $\lambda = (\lambda_1, \ldots, \lambda_n)$.

8. Hint. The inequality

$$\frac{f(x) - f(\alpha)}{x - \alpha} \le \frac{f(\beta) - f(\alpha)}{\beta - \alpha} \qquad (\alpha < x < \beta)$$

is equivalent to the inequality (1).

9. Hint. Taking the limit as B* in the — inequality 
LOS) < AB) fa) < FB) -fl#) leexei 

x—o B-a p-x 

proves the existence of f(x) and the formula f(x + 0) = f(x), while taking the 
limit as a 7 x proves the existence of /i(*) and the formula f(x — 0) = f(x). 
10. Hint. Use the same inequality as above. 
11. Hint. The intervals (/7(*),f+(*)), constructed for all points at which f(x) fails 
to exist, are pairwise nonintersecting. 
12. Hint. It follows from (2) that the only points which the curve y = f(x) shares 
with any of its chords are the end points of the chord. Hence y = f(x) must either 
lie above the chord or below the chord. 
13. Hint. If there is a point (x1, (x;)) on the curve y = f(x), x a Xg going below the 
right-hand tangent, then the curve must go below the chord joining the points 
with abscissas x, and x, Now recall the inequality (5), p. 234. 


14. Hint. Apply L’Hospital’s rules to the limits given in Chapter 4, Problem 10. 
15. Hint. Apply Lagrange’s theorem to the ratio 

S(a+h) —f(a) 

——a 

16. Hint. The slope of the tangent at the point x, = b/n cannot exceed the slope of 
the chord going through the points x, and x,, , ;. 


17. Hint. Fixing m for given Xo, let the increment be h = + '4". Then the 
increments of all the functions g,(x) vanish, starting from the mth. The function 
P»-\(x) has intervals of length 2/4™ without “corners.” The interval containing x 
also contains one of the intervals | l | | ih | 

XoXo + a | Xo »*o |. 

4" 4” 

None of the preceding functions @,(x) with n < m- | Has “corners” in this 
interval, and the increments of g,,(y) equals the increment of x in absolute value. 
It follows that 
Af(xo) _— f(xo+h)—f(xo) "G'Ag,(xo) _ }an even number if m is even, 
7 a Tn sal aoe ie “nr odd number if m is odd. 


Therefore $\Delta f(x_0)/h$ approaches no limit as $h \to 0$.


Chapter 8 


1. Hint. The derivatives of $e^{-1/x^2}$ all vanish at x = 0 (cf. Chapter 2, Problem 15).

2. Hint. Use Leibniz’s rule (Theorem 8.12b). 

3. Hint. Use Rolle’s theorem (Sec. 7.41) repeatedly. 

4. Hint. Let x9 be such that |/'(xo)| > M ; ~ €, and use the fact that f(x)—/(xo) ail 
(Xp)(x — x9) — ZMy(x — x9)”. 

5. Ans. 


n(n 


f(x) —nflx+h) + = 
F(x) —lim - 


h-O h" 


L) f(x+2h) — +++ + (—1)"f(x+nh) 


6. Hint. If $f''(x_0) < 0$, then the curve y = f(x) lies below its tangent at $x = x_0$ (see
Theorem 8.31). Now use Chapter 7, Problem 13.

7. Hint. The presence of a chord lying even partially below the curve leads to the
existence of a point $P = (x_0, f(x_0))$ in a neighborhood of which the curve lies
below its tangent at P. But this is incompatible with the condition $f''(x) > 0$.

8. Hint. Use formula (7), p. 269 to get a lower bound for |sin z|, bearing in mind
that cosh y > sinh y.

9. Hint. For $|y| > \varepsilon$, use the estimate found in solving Problem 8. For $|y| < \varepsilon$ bear
in mind that $|\sin x| > 1 - \delta$.

10. Ans.@ £107,hS a n2 ae particular, the error made in replacing the 
number e by the sum 

y 1+ T eee t iol 
does not exceed 107’ 
11. Ans. 


12. Ans. No. Use Problem 2 to construct a counterexample. 
13. Hint. Use the inequality

$$\frac{1}{(n+k)!} < \frac{1}{n!\, n^k}.$$
14. Hint. If e is rational, then n!e is an integer for some n. 


15. Hint. Use formulas (2') and (5'), p. 258, together with the result of Chapter 6, 
Problem 11. 


16. Hint. The inequality (3) certainly holds for all n = 1, 2, ... in some deleted
neighborhood of x = 0 (cf. Chapter 6, Problem 6). If the inequality fails to hold
for all $x \ne 0$, there is a smallest $x_0 > 0$ at which it leads to an equality. Now apply
Rolle’s theorem (Sec. 7.41) to the interval $[0, x_0]$.
17. Hint. Let n = 1, $x = \pi/2$.


18. Ans flx+de) = $l apt 


(k=1,2,...)- 


qd"*! (c). 


* Ga! 


Chapter 9 


; T 
1. Hint. The slope of the linear component is z \, fle) dé. 
2. Hint. Regard the right-hand side as a continuous function of ¢, assuming first 

that f(x) does not change sign. 

3. Hint. If g(x) is nondecreasing, say, then the nonnegative function g(x) − g(a
+ 0) is also nondecreasing. Now use the result of Problem 2.

4. Hint. Use mathematical induction. Alternatively, replace b by x, differentiate 
with respect to x, and verify that this gives an identity. Then return to the 
original formula by integration, verifying that the two sides coincide for x = a. 

5. Hint. Suitably generalize the proof of Theorem 9.14e. 

6. Hint. Use Riemann’s criterion (Problem 5). 

7. Hint. Use the inequality $\bigl||f(x')| - |f(x'')|\bigr| \le |f(x') - f(x'')|$ and Riemann’s
criterion.

8. Hint. Use Riemann’s criterion. 

9. Hint. Let $\varepsilon = 1, \tfrac12, \tfrac13, \ldots$ in Du Bois-Reymond’s criterion (Problem 8), and
correspondingly write $\delta/2, \delta/4, \delta/8, \ldots$ instead of $\delta$.

10. Hint. Use Lebesgue’s criterion (Problem 9). 

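11. Hint. The ratio of the left and right-hand sides of the inequality

$$\left[\frac{2n(2n-2)\cdots 4\cdot 2}{(2n-1)(2n-3)\cdots 3\cdot 1}\right]^2 \frac{1}{2n+1} < \frac{\pi}{2} < \left[\frac{2n(2n-2)\cdots 4\cdot 2}{(2n-1)(2n-3)\cdots 3\cdot 1}\right]^2 \frac{1}{2n}$$

approaches 1 as $n \to \infty$.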
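The two-sided estimate for π/2 in this hint (Wallis's inequality) can be checked numerically. The sketch below is an illustration with a hypothetical function name: it builds the quotient of the even and odd products factor by factor and returns the two bounds, whose ratio is 2n/(2n + 1).

```python
import math


def wallis_bounds(n):
    """Lower and upper bounds for pi/2 given by the inequality in the hint."""
    quotient = 1.0
    for k in range(1, n + 1):
        # Multiply in one even factor 2k and divide by one odd factor 2k - 1.
        quotient *= (2 * k) / (2 * k - 1)
    square = quotient * quotient
    return square / (2 * n + 1), square / (2 * n)


for n in (1, 10, 1000):
    low, high = wallis_bounds(n)
    print(n, low, math.pi / 2, high)
```

Since the lower and upper bounds differ only by the factor 2n/(2n + 1), which tends to 1, both of them converge to π/2.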

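12. Hint. Given any $\varepsilon > 0$, use the uniform continuity of f(x) on [A, B] to find a
$\delta > 0$ such that $|x' - x''| < \delta$ implies $|f(x') - f(x'')| < \varepsilon$. Then $d(\Pi) < \delta$ implies

$$f(\xi_k)(x_{k+1} - x_k) = \int_{x_k}^{x_{k+1}} f(x)\, dx + \varepsilon_k (x_{k+1} - x_k),$$

where $|\varepsilon_k| < \varepsilon$.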
13. Hint. If 


rn—-i 


x Xe 1%; 


k=0 


fails to be bounded for all n = 1, 2, ..., then the sum of the errors made in 


replacing f(¢,.) +1 ~ Xx) by the integrals \ fix) de (k=0,1,...2—1) 


Xk 
can become arbitrarily large. If f(x) has a discontinuity point c, we can choose 
arbitrarily many pairs of consecutive points x,, x, 4; on opposite sides of c. 


14. Hint. To verify (2) for an arbitrary piecewise constant function f(x), use 
Conditions a—d. To verify (2) for an arbitrary piecewise continuous function f(x), 
approximate f(x) by a piecewise constant function, using Condition e. 

15. Hint. Set €= 1/\/3 in formula (6), p. 352. 


Chapter 10 


1. Hint. Suppose f(z) is analytic at z= 1. Then 


co (n) 
f= PEM (e-1)" — (le-11<), 


n=0 


where 


fll) =limflz)= Yay, 


SOL) = Vo k(k=1)--(k- + Nay 


by Chapter 6, Problem 8, while 


oo 
fie!) = ? a,e*, 
k=0 


f™ (e%) = y k(k—1)++-(k—n+1)a,e'*, 
k=0 


by Abel’ S a o. (Sec. 6.68) and Chapter 7, Problem 15. It follows that the 
series roe f ce ) (2—ei9)" 
is convergent for |z — e| < 
2. Hint. Start from the formula 
Xen? a = Yb, am 
where lim m 4/ Ta, | ale 
3. Hint. Examine the mapping w = f(z) in a neighborhood of the point $z_0$.
4. Hint. Apply the maximum modulus principle to the function f(z)/z. 
5. Hint. To prove the required inequality 


Inr,—In71, In 
Inr,—In7, 


In f —In In 


In M(r,) < - 


M (rs) + F M(r,) 


(cf. Chapter 7, Problem 7), consider the function z“f (z), where a@ is such that 
riM(r,) =13M(r3). 

6. Hint. See Secs. 10.16 and 10.17. 

7. Hint. Verify the assertion separately for the functions az, az + b, and 1/z, and 
then use Sec. 10.55. 

8. Hint. Use formula (20), p. 401 to estimate $|a_n|$.

9. Hint. Calculate the increment of the argument in traversing a small circle 
about each pole and zero. 
10. Hint. See Sec. 10.44. 
11. Hint. Let 


oo (k) a) 
ee ‘i (2—4)". 


Then 


60 a k 

lez )+- + gmt z)|= Die grey a) ++ + gi tat(g)) ca 

12. Ans. Each sheet of the Riemann surface of the function (2) consists of the
whole w-plane cut along any curve joining the points −1 and +1, e.g., along the
interval [−1, 1] or along the part of the real axis complementary to [−1, 1]. The
Riemann surface is two-sheeted, with the upper edge of the cut on the first sheet
being pasted to the lower edge of the cut on the second sheet, and vice versa.
The branch points in the w-plane are at w = ±1. The domains of univalence in the
z-plane are the interior and the exterior of the unit circle |z| = 1.

As for the function (3), each sheet of the Riemann surface consists of the
w-plane cut along the real axis from −∞ to −1 and from 1 to ∞. The Riemann
surface is infinite-sheeted, with the upper edge of the left half of the cut on the
kth sheet being pasted to the lower edge of the left half of the cut on the (k + 1)st
sheet and the upper edge of the right half of the cut on the kth sheet being pasted
to the lower edge of the right half of the cut on the (k − 1)st sheet. The branch
points in the w-plane are at w = ±1, ∞. The domains of univalence in the z-plane
are the strips $k\pi < x < (k + 1)\pi$, k = 0, ±1, ±2, ...

13. Hint. Use the representation of cos z in terms of exponentials. 
14. Hint. Consider the function $c_1 f_1(z) + \cdots + c_n f_n(z)$, where the numbers $c_k$ are
such that $|c_k| = 1$ and

$$c_1 f_1(z_0) + \cdots + c_n f_n(z_0) = |f_1(z_0)| + \cdots + |f_n(z_0)|$$

at the point $z_0$ where $|f_1(z)| + \cdots + |f_n(z)|$ has its maximum.
15. Hint. See Sec. 10.49d. 

16. Hint. Consider 1/P(z). 

17. Hint. Note that

$$\frac{P'(z)}{P(z)} = \frac{n}{z} + \frac{g(z)}{z^2},$$

where g(z) is bounded as $z \to \infty$. Now integrate this formula along a circle of
sufficiently large radius.

18. Hint. If P(z) does not vanish, then 1/P (z) achieves its maximum at a finite 
point. 

19. Hint. On the one hand, 


if © «- 
x r.t— 


Where the sum is over all ie of (z) inside J*,, while on the other hand, 


1f Mg-2 “sf £0 424 SO x 
myr, ug 


Qni tr (—z 2 ¢ 2ni J 6(C—z) 
z SO)» 


+54 Qnif, (C—z)-” 


where the last integral approaches 0 as n > o. 

20. Hint. Concerning finite products, see Chapter 6, Problem 10. To get (5), 
integrate (4) and then take exponentials. 

21. Hint. Use Chapter 8, Problem 9. To get the first expansion, let g(z)= sin z 


in (5). 

22. Hint. Consider the function 

fzje = (A < 2/2a) in the sector |arg z| <a, |z| <r, and apply the maximum 
modulus principle. 

23. Hint. Apply Problem 22 to the function f{z)e’” (y gt; 0) in each of the two 
quadrants forming the upper half-plane, obtaining the estimate |f(z)e!””| S max 
{C, M,}, where 

M, =max Ae, 


r 


If M, > C, then |f(z)| has a maximum at a point of the imaginary axis, which is 
impossible if f(z) # const (see Problem 3). Therefore M, S C, |fz)| SC |e". 
Now let y—0. 

24. Hint. Apply Problem 23 and Liouville’s theorem. 


Chapter 11 


1. Hint. Prove the formula for an “inner triangle” on which the integrand is 
continuous, and then pass to the limit. 
2. Hint. Note that 


= v= |e és |" as 


x ad bd 


bd 4 

-| I) dea f(e) in” (0<6, a<b, ad<&<bd), 
ad OX a 

where the last step involves a simple generalization of Theorem 9.15f. 

3. Ans. Using a contour of integration L made up of a segment of the real axis, 
an arc of a circle centered at the origin, and a segment of the ray 0 = 4 (so that L 
is the boundary of a circular sector), we get [(p) = el /p for a suitable value of 
Vb. 

4, Ans. Using Problem 3, we find that F, = F, = 7/7/2. 

5. Ans. 

P(x) sin x 
Q(x) x’ 
where the sum is over all the zeros of Q(x) in the upper half-plane. 


6. Hint. The integral with the infinite limit is not uniformly convergent in the 
parameter f. 


7. Hint. Let a approach 0, appealing to Theorem 11.53. 
8. Hint. Make the substitution $x = a \cos^2 \theta + b \sin^2 \theta$.


9. Hint. Denoting the integral by J and setting x = 2t, we get 
rx/4 n/4 n/4 

1=2|  Insin 21 dt =7in 242 | Insin 1 dt+2 | In cos ¢ dt. 

Jo Jo 0 


But the substitution ¢ = (2/2) —u reduces the last term on the right to 


I=2ni ¥' Res 


n/2 
2| In sin u du=2I. 


n/4 


10. Hint. First note that 


where the integrals of the second and third terms on the right cancel each other 
out (after an obvious substitution). Now use Problem 2. 
11. Hint. Use integration by parts (Sec. 11.15a). 


12. Ans. 1 
n i 2m+1 
2n " 


13. Hint. Compare the values of s*_! on the interval [e, 00) traversed first in one 
direction and then in the other. 


Index 


Abel-Dirichlet test 
for improper integrals, 441-443, 466 
for series, 205—209 

Abel-Liouville theorem, 305n 

Abel’s theorem, 217 

Abel’s transformation, 205 


Absolute value 
of a complex number, 166 
of a real number, 10 


Addition 
associativity of, 3 
commutativity of, 3 
Addition axioms, 3 
consequences of, 4—5 
Additivity condition, 484 
Algebraic number, 33, 34 
Alternating series, 194 
Analytic continuation, 262, 402—403 


Analyticity 

at a point, 374 

on a set, 374 
Analytic function(s), 374 

infinite differentiability of, 395 

power series expansion of, 395 

real, 403 

uniqueness theorem for, 401 

zero (root) of, 400 

order (multiplicity) of, 400 

Angle between two vectors, 168 
Antiderivative, 292 
Arc length, 289-291, 328-335 

of a catenary, 331 

of a circle, 331-332 

of an ellipse, 332 

as parameter, 334—335 
Archimedes, 1n

principle of, 16 

multiplicative version of, 16 


Area 

behavior of, under x-fold expansion, 321-322

of a catenoid, 342 

of a circular disk, 320 

of a circular sector, 321 

of a conical band, 338 

over a curve, 318 

under a curve, 288, 317 

enclosed by an ellipse, 322 

of a geometric figure, 287 

in terms of line integrals, 366-368 

of the neighborhood of a curve, 346 

in polar coordinates, 327 

of a sphere, 341 

of a surface of revolution, 339 
Argument of a complex number, 166 

principal value of, 425n 
Argument of a function, 49, 166n 
Argument principle, 428 
Arithmetic mean, 24 
Arithmetic nth power of a set, 26 
Arithmetic product of two sets, 25 
Arithmetic sum of two sets, 25 
Asymptote, 130, 248 
Asymptotic unit, 116 


Automorphisms 
of n-dimensional space, 41—43, 170 
of a structure, 35 

Axioms, complete system of, 36 


Baire’s theorem, 85 


Ball(s) 
closed, 55, 76n 
open, 55, 61n 


principle of nested, 85 

radius of, 55 
Banach, S., ix
Basis, 41 
Beta function, 469 

in terms of the gamma function, 471 
Binary system, 21 
Binomial coefficients, 249, 313 
Birkhoff, G., 177n, 178n 


Block 

closed, 134 

open, 134 
Bolzano’s theorem, 140 
Bolzano-Weierstrass principle, 75 

for sequences, 75 
Boundary of a set, 486 
Bounded function, 108 
Bounded set, 15 

in a metric space, 54 
Bourbaki, N., ix, 35n
Branch point, 427 


Cantor, G., 21, 33 
Cartan, H., x 

Catenoid, 342 

Cauchy, A. L., 190, 435 


Cauchy convergence criterion 
for a function with values in a metric space, 106 
for improper integrals, 432, 461 
for a numerical sequence, 81, 123 
for a numerical series, 187 
for a sequence of functions, 182 
for a series of functions, 212 
for a vector function, 129 
for a vector series, 204 
Cauchy-Hadamard theorem, 213 
Cauchy-Riemann equations, 378 
Cauchy sequence, 80 
Cauchy’s formula, 394 
Cauchy’s inequality, 58 
Cauchy’s mean value theorem, 238 
Cauchy’s test, 189 
Cauchy’s theorem 
on complex integration, 389 
Cells, 484 
measure of, 484 
Characteristic function, 486 
Chebotarev, N. G., 305n 
Circle of convergence, 214 
Circular functions, 270 
Closed ball, 55, 76n 
Closed block, 134, 485 
Closed interval, 15, 23 
Closed path, 365 
Closed set, 76 
Closure, 78 
Cofinal sequence, 87 
Compact metric space, 91 
Compactum, 91 
measure and integration on, 484-488 
weighted, 484 


Comparison test 
for improper integrals, 433 
for series, 188 
Complement of a set, 28 
Complementation, 28 
Complete metric space, 81 
Complete system of axioms, 36 
Completion (of a metric space), 87 
Complex conjugates, 46 
Complex function(s), 173, 373 
analytic, 374 
continuous, 173 
differentiable, 373 
line integrals of, 381-383 
singularity of, 400 
singular point of, 400, 445-446 
Complex mth roots, 168 
Complex number(s), 43 
absolute value of, 166 
argument of, 166 
conjugate, 46 
exponential form of, 264 
imaginary part of, 46 
modulus of, 166 
multiplication of, 166 
real part of, 45 
trigonometric form of, 166, 264 
Complex number system, 45 
automorphisms of, 47 
uniqueness of, 46 
Complex plane, 45, 373 
extended, 412 
Component of an open set, 62 
Components of a vector, 39, 41 


Concavity 
at a point, 240 
tests for, 241, 242, 255, 256 
on a set, 241 
of sin x, cos x, 161n, 242 
Condensation point, 97 


Cone 
area of, 336-337 
directrix of, 336 
generator of, 336 
pyramidal surface inscribed in, 336 
vertex of, 336 
Conformal mapping, 417 
Congruent sets, 60, 486 


Conical band 

area of, 338 

slant height of, 338 
Conical surface (see Cone) 
Constant of integration, 295 


Continuity 
on compacta, 136 
from the left, 141 
modulus of, 138 
one-sided, 141 
at a point, 132, 172, 173 
from the right, 142 
on a set, 132 
uniform, 137 
Continuity point, 132 
Continuum, 33 
set with power of, 34 
Contour, 366 
Convergent sequence, 63 
boundedness of, 66 
limit of, 63 
Convergent series, 186 
Convexity, 246-247 
Convolutions, 463-467 
Cosecant, 165 
Cosine, 158 
derivative of, 230 
Cotangent, 165 
Countable set, 31 
Covering, 93 
Curve(s), 290, 329 
angle between two, 417n 
closed, 365, 389 
exterior of, 389 
interior of, 389 
length of, 291, 328-331 
multiple point of, 389 
neighborhood of, 345 
parametric, 329 
piecewise smooth, 329 
polygonal line inscribed in, 290, 329 
simple, 389 
singular point of, 335 
Curvilinear trapezoid, 288, 487 


D’Alembert’s test, 188 

Darboux’s theorem, 246 

Decimal representation of a real number, 19 
digits of, 19 

Decreasing function, 142 

Dedekind cut, 26 


Definite integral, 293 
Degree (unit of angular measure), 160 
Deleted neighborhood, 101, 240n 


Dense set 

relative to a set, 78 

in a set, 78 
Dependent variable, 49 
Derivative(s), 223 

of a complex function, 373 

higher, 376 

of a composite function, 226 

higher, 249 

of an inverse function, 227 

left-hand, 231 

one-sided, 232 

of order n, 249 

right-hand, 231 

rules for calculating, 223-231 

second, 249 
Determinant, derivative of, 226 
Diameter of a set, 54 
Dieudonne, J., ix, 145 
Difference of two real numbers, 5 
Difference quotient, 223, 373 


Differentiability 
in the complex domain, 373 
continuous, 249 
of a function of two variables, 428 
at a point, 223, 373 
on a set, 243n, 374n 
uniform, 246 
Differential, 236, 376 
of a composite function, 237 
invariance property of, 237, 296n, 377 
of order n, 272
Differentiation, 223 
of a composite function, 226 
of an inverse function, 227 
of the logarithm and related functions, 227-229
n-fold, 249
rules for, 223-231
of a sequence of functions, 353-354
of a series of functions, 354-355
of trigonometric functions and their inverses, 229-231
Digits, 19 
Direct product, 49 
equality of elements of, 49 
of metric spaces, 60 
metrization of, 60 
section of, 49 
Direction, 99 
Dirichlet function, 285, 350 
nonintegrability of, 285 
Dirichlet’s formula, 479 
Dirichlet’s theorem, 200 
Discontinuity point, 132 
Distance in a metric space, 53 
from a point to a set, 97 
symmetry of, 53 
Divergent sequence, 63 
Divergent series, 186 
Domain, 374 
multiply connected, 392 
simply connected, 389 
Domain of definition, 49 
Domain of univalence, 425 
DuBois-Reymond’s criterion, 371 


e, definition of, 124 
E, definition of, 115 


Elementary function, 305n 
Elliptic integral(s), 305, 332, 333 
Legendre’s incomplete, of the first and second kind, 306n 
modulus of, 306 
of the third kind, 307
Empty set, 1
End point, 15 
left-hand, 15 
right-hand, 15 
Entire function, 400 
ε-net, 93
Equivalent sets, 29 
Essential singular point, 412 
behavior at, 414-416 
at infinity, 413 
Euclidean vs. non-Euclidean geometry, 3n 
Euler, L., 221, 437, 457 
Euler’s constant, 437 
Euler’s formula, 263 
Expansion, 43 
Exponential(s), 149-157
to the base a, 149 
complex, 262, 420-421 
definition of, 149 
derivative of, 228 
Exponential integral, 308 
Extended complex plane, 412 
Extended real line, 23 
Extended real number system, 22 
Extension of a function, 103n 


Fichtenholz, G. M., 124n, 179n, 300n, 307n, 308n 
Field, 4 
Field of complex numbers, 45 
Finite covering theorem, 96 
Finite difference formula, 239 
Finite numbers, 23 
Finite points, 412 
Fourier integrals, 451-457 
Fractional linear function, 419 
Fractional part, 16 
Fresnel integrals, 479 
Frobenius, G., 48
Frullani’s formula, 479 
Function(s), 48 

analytic (see Analytic function) 


argument of, 49 
belonging asymptotically to a set, 107 
bounded, 108 
in a direction, 109 
bounded from above, 108 
in a direction, 109 
bounded from below, 108 
in a direction, 109 
characteristic, 486 
of a complex variable, 173, 373 
composite, 135 
continuity of, 135 
derivative of, 226, 375
differential of, 237 
concave downward, 240 
concave upward, 240 
continuous, 132, 172, 173 
on compacta, 136 
from the left, 141 
piecewise, 283 
at a point, 132 
from the right, 142 
on a set, 132 
of two variables, 138 
uniformly, 137 
continuously differentiable, 249 
convex, 246 
decreasing, 142 
defined on a set, 49n 
derivative of, 223, 373 
differentiable, 223 
continuously, 249 
infinitely, 259n 
properties of, 233-235 
uniformly, 246 
differential of, 236 
Dirichlet, 285, 350 
nonintegrability of, 285 
domain of definition of, 49 
equivalent, in a direction, 115 
even, 319n 
extension of, 103n 
extremum (local) of, 235 
fractional linear, 419 
graph of, 50 
harmonic, 381 


hyperbolic, 267-270 

inverse, 304 
increment of, 235 
increasing, 142 
infinitely differentiable, 259n 
inflection point of, 241 
integrable, 276, 484 

absolutely, 438n 
integral of, 274, 484 
integral mean value of, 280 
inverse, 143, 144 

derivative of, 227 
linear, 416 
limit point of, in a direction, 118 
lower limit of, in a direction, 117 
maximum (local) of, 235 
minimum (local) of, 235 
monotonic, 142 
multiple-valued, 50 
nondecreasing, 142 

in a direction, 120 
nonincreasing, 142 

in a direction, 121 
nonnegative, in a direction, 109 
numerical, 49 
odd, 319n 
one-to-one, 49 
periodic, 161, 265 
piecewise constant, 284 
piecewise continuous, 283 
piecewise smooth, 296 
positive, in a direction, 109 
range of, 49n 
real, 49 
real analytic, 403 
of a real variable, 49 
restriction of, 103n 
sequence of (see Sequence of functions) 
series of (see Series of functions) 
single-valued, 50 
smooth, 249 
sum of, 108 
trigonometric (see Trigonometric functions)
uniformly continuous, 137 
univalent, 419 
upper limit of, in a direction, 118 


value of, 49 

vector, 49 
Fundamental sequence(s), 80

cofinal, 87 
Fundamental theorem of algebra, 174, 176, 429 
Fundamental theorem of calculus, 293 


Gamma function, 467 
asymptotic representation of, 477 
complementation formula for, 474 
in the complex domain, 478-479 
functional equation for, 469 
graph of, 474 

Gauss, C. F., 130, 218

Geometric mean, 24 

Geometric series, 186-187 

Goursat, E., ix 

Graph, 50 

Greatest common divisor, 178 

Greatest lower bound, 13 

Grushin, V. V., 51n 

Gurevich, B. L., 285n, 317n, 488n 


Hadamard, J., 427 
Half-closed interval, 15 
Half-open interval, 15 
Hardy, G. H., 308n 
Harmonic function(s), 381
conjugate, 381 
Harmonic mean, 192 
Harmonic series, 191 
Hausdorff’s criterion, 93 
Hausdorff’s theorem, 87 
Heine’s theorem, 137 
Hermite’s theorem, 124 
Hirsch, K. A., 48 
Holomorphic function, 374 
Homeomorphic metrics, 68 
Homeomorphic spaces, 67 
Homeomorphism, 67 
Hyperbolic cosine, 267 
Hyperbolic functions, 267-270 
inverse, 304 
Hyperbolic sine, 267 
Hyperelliptic integral, 305 
Hypergeometric series, 218 


i, definition of, 45 
Identity transformation, 42 
Image, 417 
Imaginary axis, 46 
Imaginary part, 46 
Imaginary unit, 45 
Improper integral(s)
absolutely convergent, 438 
Cauchy convergence criterion for, 432, 461 
comparison test for, 433 
conditionally convergent, 438 
convergence of, 438-443 
evaluation of, by residues, 448-457 
of the first kind, 431 
convergent, 431 
divergent, 431 
value of, 431 
integration by parts for, 437 
parameter-dependent, 457-467
of the second kind, 443 
convergent, 443 
divergent, 443 
value of, 443 
of the third kind, 445-447 
convergent, 446 
value of, 446 
Inclusion relations, 1
Increasing function, 142 
Increment, 235, 376 
principal linear part of, 236, 376 
Indefinite integral, 293 
Independent variable, 49 
Inequalities, 4, 9-12 
Infinite products, 220-221, 430 
value of, 220 


Infinity 
minus, 23 
plus, 23 
point at, 412 
Inflection point, 241 
tests for, 242, 256, 257 
Integer(s), 6 
even, 6 
odd, 6 
positive, 6 
Integrable majorant, 462 
Integral multiples, 16 
Integral part, 16 
Integral sign, 278n 
Integral test, 435 
Integral(s), 274 
axiomatic definition of, 372 
of the Cauchy type, 388 
definite, 293 
as a function of its upper limit, 291 
general properties of, 278-280 
improper (see Improper integral) 
indefinite, 293 
line (see Line integrals) 
over an n-dimensional block, 485 
parameter-dependent (see Parameter-dependent integrals)
Riemann, 274
Stieltjes, 316 
over a weighted compactum, 484 
Integrand, 278n 


Integration 
over a compact metric space, 484-488 
constant of, 295 
of irrational expressions, 302-305
lower limit of, 275 
with respect to a parameter, 356, 459 
by parts, 295, 309, 437 
repeated, 371 
by rationalizing substitutions, 300-301 
of a sequence of functions, 349-351 
of a series of functions, 351-353 
by substitution, 295-296, 313-316, 437-438
upper limit of, 275 
variable of, 275 
Interior point, 55, 61, 136, 374 
Intermediate value theorem, 140 
Intersection, 2, 27 
Interval(s), 15
adjacent, 77 
closed, 15, 23 
component, 62 
deleted, 101 
end point of, 15 
left-hand, 15 
right-hand, 15 
half-closed, 15 
half-open, 15 
open, 15, 23 
Inverse cosine, 163 
alternative definitions of, 163 
derivative of, 230, 231 
Inverse (function), 143, 144 
graph of, 144 
Inverse hyperbolic cosine, 304 
geometric meaning of, 324 
Inverse hyperbolic functions, 304 
Inverse hyperbolic sine, 304 
Inverse image, 417 
Inverse sine, 162 
derivative of, 230 
Inverse tangent, 165 
derivative of, 231 
Inverse trigonometric functions, 162-165 
Irrational numbers, 6, 34 
Isolated point, 86 
Isolated singular point(s), 409 


classification of, 411-412 
Isometry, 59 

linear, 59 
Isomorphic sets, 35 
Isomorphic structures, 35 


Jordan curve theorem, 389 
Jordan measure, 486 
Jordan sets, 486 
congruent, 486 
Jordan’s lemma, 453 


Kronecker delta, 170n 
Krylov, A. N., 265n 
Kurosh, A. G., 48n 


Lagrange’s theorem, 238 
Landau, E., x 
Laplace’s equation, 381 
Laurent coefficients, 410 
Laurent expansion, 409n 
Laurent series, 409 
principal part of, 411 
regular part of, 411 
Least upper bound, 4 
Least upper bound axiom, 314 
consequences of, 12-15
Lebesgue integral, 285 
Lebesgue measure, 488 
Lebesgue’s criterion, 371 
Left-hand derivative, 231 
Left-hand tangent, 232 
Leibniz’s rule, 249 
Leibniz’s test 
for improper integrals, 438 
for series, 194, 207 
Length of a curve, 291, 328-331 
L’Hospital’s rule 
for ∞/∞, 244
for 0/0, 243 


Limit point 
of a function, 118 
of a sequence of numbers, 124 
of a sequence of points, 73 
of a set, 74 
Limit(s), 63 
in a direction S, 99 
lower, 117, 125 
as n → ∞, 99
of numerical functions, 108-126 
one-sided, 141 
partial, 102 
uniqueness of, 66 
upper, 118, 125 
as x → a, 101
as x → +∞, 100
as x → −∞, 100
as |x| → ∞, 100
Limits of integration, 275 
Line integrals, 361-370 
along a closed contour, 365-366 
area in terms of, 366-368 
of complex functions, 381-383 
Linear function, 416 
fractional, 419 
mapping by a, 417 
Linear inequalities, 135 
Linear subspace(s), 222n 
orthogonal, 222n 
Linearly dependent vectors, 40 
Linearly independent vectors, 40 
Linearly ordered set, 35 
Liouville’s theorem, 401 
Liusternik, L. A., 93 
Lobachevskian geometry, 3n 
Local extremum, 235 
Local maximum, 235 
tests for, 242, 255 
Local minimum, 235 
tests for, 242, 255 
Locally compact metric space, 91 
Logarithm(s), 145-149
to the base a, 147 
to the base e, 155 
complex, 421-422
continuity of, 148 


definition of, 147 
derivative of, 228 
natural, 155 
Logarithmic integral, 308 
Logarithmic residue, 408 
Lower bound, 12 
greatest, 13 


Lower limit 
of a function, 117 
of a sequence, 125 


MacLane, S., 177n, 178n 
Mapping, 48 

conformal, 417 

one-to-one, 29 

of Ro into itself, 169 
Markushevich, A. I., 415n, 424n 
Mathematical induction, method of, 6 
Mathematical structure(s), 35 

automorphisms of, 35 

isomorphic, 35 

uniqueness of, 36 
Maximum modulus principle, 427 
Maximum of two real numbers, 10 
Mean value of an integral, 280 
Mean value theorem for integrals, 280, 384 
second, 370 
Mean value theorems, 237-240


Measure 
Jordan, 486 
Lebesgue, 488 
Method of undetermined coefficients, 179 
Metric(s), 53 
homeomorphic, 68 
Metric space(s), 53 
compact, 91 
complete, 81 
completion of, 87 
definition of, 53 
direct product of, 60 
homeomorphic, 67 
isometric, 59 
isomorphism of, 59 
locally compact, 91 
metric (or distance) of, 53 
precompact, 92 
m-gon 
closed, 135 
open, 135 
Minimum of two real numbers, 10 
Minus infinity, 23 


Modulus 

of a complex number, 166 

of a real number, 10 
Modulus of continuity, 138, 276 
Monotonic function, 142 
Multiple point, 389 


Multiplication 
associativity of, 4 
commutativity of, 4 
distributivity of, 4 
Multiplication axioms, 3 
consequences of, 5-9
Multiplicity (order) of a root, 177 


Natural numbers, 6 
n-dimensional Euclidean space, 56 
n-dimensional (real) space, 39, 56 
addition of elements in, 39 
multiplication of elements in, by real numbers, 39 
points of, 39n 
coordinates of, 39n 
vectors in, 39 
components of, 39 
Negative (element), 3 
uniqueness of, 5 
Negative number, 10 
Neighborhood, 55, 240n 
of a curve, 345 
area of, 346 
deleted, 101, 240n 
one-sided, 141n 
Nemirovski, A. S., 218 
Nested intervals, system of, 21 
Nested subsets, system of, 83 
Newton’s law of gravitation, 343 
n factorial, 469 
n-leaved rose, 327 
Nondecreasing function, 142 
in a direction, 120 
Nonincreasing function, 142 
in a direction, 121 
Nonnegative number, 10 
Nonpositive number, 10 
Norm of a vector, 56 
Normalized vector, 56 
nth power of a real number, 7 
nth root 
of a complex number, 420 
of a real number, 13 
Number field, 4 
Numerical series, 186 


o, definition of, 109, 115 
O, definition of, 109 
One-sided continuity, 141 
One-sided derivatives, 232 
One-sided limits, 141 
One-sided neighborhoods, 141n 
One-to-one correspondence, 29, 49 
One-to-one function, 49 
One-to-one mapping, 29 
Open ball, 55, 61n 
radius of, 55 
Open block, 134 
Open interval, 15, 23 
Open set, 61 
in the complex plane, 374 
component of, 62 
Order axioms, 3, 4 
consequences of, 9-12 
involving addition, 9-11 
involving multiplication, 11-12
Orthogonal transformation, 170 
Orthogonal vectors, 169 


Oscillation of a function 
on an interval, 371 
at a point, 371 
Osculating parabola, 255 


Palamodov, V. V., 51n 
Parameter-dependent integrals, 355-361 
differentiation of, 357-361 
improper, 457-467
analyticity of, 461 
differentiation of, 357-361 
integration of, with respect to the parameter, 459 
uniform convergence of, 458 
integration of, with respect to the parameter, 356-357 
Partial derivative, 357n, 377 
Partial fraction expansions, 178, 429 
Partial limits, 102 
Partial products, 220 
Partial sums, 186, 203 
symmetric, 210 
two-sided, 209 
Partition, 274 
finer, than another partition, 276 
with marked points, 274 
parameter of, 274 
refinement of, 275 
P-ary system, 21 
Path, 361 
closed, 365 
Period, 161, 265 
Periodic function, 161 
complex, 265 
Phragmen-Lindelof theorem, 430 
Picard, E., ix, 415 
π, definition of, 160
Plus infinity, 23 
Point at infinity, 412 
as a limit, 412 
as a limit point, 413 
Points of subdivision, 274 
Polar angle(s), 165, 172


Polar coordinates 
area in, 327 
in the plane, 166 
in space, 172 
Polar distance, 166n 


Pole 
of an analytic function, 400 
at infinity, 413 
of order m, 412 
Polya, G., 429 
Polygonal line, simple closed, 287 


Polyhedron 
closed, 135 
open, 135 
Polynomial(s)
continuity of, 174 
differentiation of, 225, 375 
factorization of, 176 
greatest common divisor of, 178 
integration of, 297, 386 
long division of, 177n 
dividend (divisor, quotient, remainder) in, 177 
with real coefficients, 179 
root (zero) of, 174 
multiplicity (order) of, 177 
Positive integers, 6 
Positive number, 10 
Power of sets, 30 
Power series, 213-218 
analyticity of, 398 
circle of convergence of, 214 
radius of convergence of, 214 
real, 213n 
region of convergence of, 213 
Precompact metric space, 92 
Principal linear part, 236 
Principal part of a Laurent series, 411 
Principle of Archimedes, 16 
Principle of nested balls, 85 
Principle of nested intervals, 21 
Punctured disk, 392, 411 
Purely imaginary number, 46 


Quadrilateral inequality, 53 


Raabe’s test, 193

Radius of convergence, 214 

Radius vector, 166, 172 

Range of a function, 49n 

Ratio of similitude, 43 

Ratio test, 189n 

Rational function(s)
continuity of, 139, 174 
differentiation of, 225, 375 
Fourier integrals of, 452-457 
integration of, 298-300, 447-451 
limit of, 117 


partial fraction expansion of, 178 
Rational numbers, 6 
Real analytic function, 403 
Real axis, 46 
Real line, 3 
extended, 23 
points of, 3 
Real number(s), 3 
decimal representation (expansion) of, 19 
finite, 23 
product of, 3 
quotient (ratio) of, 6 
sum of, 3 
Real number system, 3 
automorphisms of, 3 
extended, 22 
uniqueness of, 36 
Real part, 45 


Reciprocal 

of a complex number, 44 

of a real number, 4 

uniqueness of, 6 

Reflection, 60 
Region of convergence, 213 
Regular function, 374 
Regular part of a Laurent series, 411 


Relation 
reflexive, 30 
symmetric, 30 
transitive, 30 
Remainder in Taylor’s formula, 253 
in integral form, 310 
Lagrange’s form of, 253, 311 
Removable singular point, 412 
at infinity, 413 
Repeating decimal, 25 
Residues, 405-408
logarithmic, 408
use of, to evaluate improper integrals, 448-457
Residue theorem, 407 
Restriction of a function, 103n 
Riemann integral, 274, 484 
Riemann’s criterion, 371 
Riemann sum, 274, 484 
generalized, 372 
limit of, 275, 484 
Riemann surfaces, 423-427
Riemann’s theorem, 201 
Riemann-Stieltjes sum, 316 
Right-hand derivative, 231 
Right-hand tangent, 232 
Rolle’s theorem, 237 
Root (zero) of a polynomial, 174 
multiplicity (order) of, 177 
Root test, 189n 


Rotation 
in R_2 through angle θ, 169


in Rp, 170 
Rudin, W., 13


Scalar product, 56 
Schwarz inequality, 57 
Schwarz’s lemma, 427 
Secant (function), 165 
Section of a direct product, 49 
Section of a series, 186 
Semitangent, 232n 
Sequence, 49 
bounded, 123 
from above, 123 
from below, 123 
Cauchy, 80 
Cauchy convergence criterion for, 81, 123 
cofinal, 87 
convergent, 63 
divergent, 63, 123 
to −∞, 123
to +∞, 123
limit of, 63, 123 
limit point of, 73, 124 
lower limit of, 125 
nondecreasing, 119, 123 
nonincreasing, 123 
subsequence of, 74 
upper limit of, 125 
Sequence of functions, 179-183 
Cauchy convergence criterion for, 182 
convergent, 180 
differentiation of, 353-354 
integration of, 349-351 
limit of, 180 
uniformly convergent, 181, 350 
inside a domain, 396 
Series, 186-222 
absolutely convergent, 194 
alternating, 194 
Cauchy convergence criterion for, 187 
complex, 205 
product of, 205 
conditionally convergent, 194 


convergent, 186 
divergent, 186 
of functions (see Series of functions) 
geometric, 186 
grouping of terms of, 198 
harmonic, 191 
hypergeometric, 218 
infinite, 186n 
Laurent (see Laurent series) 
numerical, 186 
one-sided, 209 
operations on, 196-203
partial sums of, 186 
positive, 186 
power (see Power series) 
product of, 197 
with a number, 203 
rearrangement of, 199-203 
section of, 186 
sum of, 196 
Taylor (see Taylor series) 
two-sided (see Two-sided series) 
of vectors (see Series of vectors) 
Series of functions, 211-213
Cauchy convergence criterion for, 212 
convergent, 212 
differentiation of, 354-355 
integration of, 351-353 
sum of, 212 
uniformly convergent, 212 
inside a domain, 396 
Series of vectors, 203-211 
absolutely convergent, 204 
Cauchy convergence criterion for, 204 
complementary parts of, 222 
conditionally convergent, 204 
convergent, 203 
divergent, 203 
partial sums of, 203 
sum of, 203 
vector of absolute convergence of, 221 
vector of absolute divergence of, 222 
Set(s), 1 
boundary of, 486 
bounded, 15 
in a metric space, 54 


bounded from above, 4 
bounded from below, 12 
closed, 76 
closure of, 78 
complement of, 28 
congruent, 60, 486 
connected (arcwise), 374 
convex, 389n 
countable, 31 
dense, 78 
diameter of, 54 
direct product of, 49 
empty, 1
equivalence of, 29-31
equivalent, 29
finite, 1
infinite, 1
interior point of, 55, 61 
intersection of, 2, 27 
isomorphic, 35 
Jordan, 486 
Jordan measure of, 486 
of (Jordan) measure zero, 485 
limit point of, 74 
linearly ordered, 35 
lower bound of, 12 
nonintersecting, 2 
pairwise, 493 
open, 61 
operations on, 27-29
power of, 30 
product of, 27 
of the same power, 30 
subset of, 1
proper, 1
sum of, 27 
unbounded, 54 
uncountable, 33 
union of, 2, 27 
upper bound of, 4 
volume of, 486 
Shilov, G. E., 40n, 41n, 42n, 225n, 285n, 317n, 488n 
Silverman, R. A., 40n, 124n, 179n, 225n, 285n, 289n, 300n, 381n, 415n, 488n 
Similarity transformation, 43 
Sine, 158 
derivative of, 229 


Sine integral, 308, 353 
Single-leaved rose, 400 
Singularity, 400


Singular point 

of a complex function, 400, 445-446

of a curve, 335 

essential, 412 

isolated, 409 

removable, 412 
Sphere in a metric space, 55 
Springer, G., 424n 
Steinitz’s theorem, 222 
Stieltjes integral, 316 
Stieltjes line integral, 362 
Stirling’s formula, 478 
Structure (see Mathematical structure) 
Subdivision, points of, 274 
Subsequence, 74 
Subset, 1 

proper, 1 
Sum of a series, 186 
Surface of revolution, 335 

area of, 339 
Symbolic logic, 482-483


Tangent (to a curve), 223 
left-hand, 232 
right-hand, 232 

Tangent (function), 163 
derivative of, 230 

Taylor coefficients, 396 

Taylor expansion, 260n 

Taylor series, 260, 396 

Taylor’s formula, 252, 257, 310 
remainder of, 253 

Taylor’s polynomial, 253 

Ternary system, 21 

Transcendental number, 34 


Transformation 
(proper) orthogonal, 170 
of Rp into itself, 169 
Triangle inequality, 53 
Trigonometric form of a complex number, 166 
Trigonometric functions, 157-172
applications of, 165-172 
in the complex domain, 263 
definition of, 165 
differentiation of, 229-230 
inverse, 162-165 
differentiation of, 230-231 
Two-sided series, 209-211
convergent, 209 
partial sums of, 209 
symmetric, 210 
sum of, 209 
symmetrically summable, 210 
symmetric sum of, 210 


Unbounded set, 54 
Uncountable set, 33 
Uniform continuity, 137 


Uniform convergence 

inside a domain, 396, 460 

of improper integrals, 458, 466 

of a sequence of functions, 181, 350 
of a series of functions, 212 
Union, 2, 27
Uniqueness theorem for analytic functions, 401
Unit, asymptotic, 116
Unit element, 4, 44
uniqueness of, 5
Unit vector, 56
Univalent function, 419
Upper bound, 4
least, 4
Upper limit
of a function, 118 
of a sequence, 125 
Uspensky, V. A., 3n 


Van der Waerden, B. L., 248 
Vector of absolute convergence, 221 
Vector of absolute divergence, 221 
Vector function(s), 126 
bounded, in a direction, 127 
Cauchy convergence criterion for, 129 
product of, with a real function, 126 
sum of, 126 
of a vector variable, 172 
Vector(s), 39 
angle between, 168 
component(s) of, 39
along another vector, 222n 
orthogonal to another vector, 222n 
with respect to a basis, 41 
length of, 56 
linearly dependent, 40 
linearly independent, 40 
negative (opposite) of, 40 
norm of, 56 
normalized, 56 
orthogonal, 169 
projection of, onto a subspace, 222n 
projection of, onto another vector, 222n 
scalar product of, 56 
unit, 56 
Volume, 289 
additivity of, 486 
of an ellipsoid, 349 
of an n-dimensional block, 485 
of a set, 486 
of a solid with cross sections of known area, 347


Wallis’ product, 371 
Weierstrass’ test, 212 
Weierstrass’ theorem 
on continuous numerical functions, 136
on sequences of analytic functions, 397 
w-plane, 373n 


x raised to an arbitrary real power, 151 
xth power of a real number, 149n 


Young’s inequality, 325 


Zero 
of an analytic function, 400 
order (multiplicity) of, 177, 400 
of a polynomial, 174 

Zero element, 3 
uniqueness of, 5 

z-plane, 373n 


